EXPRESSION



37 C.F.R. §1.74(d)/(e) Copyright Notice
A portion of the disclosure of this patent document contains material which is
subject to copyright protection. The copyright owner does not object to the
reproduction by anyone of the patent document or the patent disclosure, as it
appears in the Patent and Trademark Office patent files or records, but
otherwise reserves all copyright rights whatsoever.

RELATED APPLICATIONS

This patent document is a Continuation-In-Part of U.S. Serial No. 07/977,691,
filed November 13, 1992, pending. This patent document is related to
"THERAPEUTIC APPLICATION OF CHIMERIC ANTIBODY TO HUMAN B
LYMPHOCYTE RESTRICTED DIFFERENTIATION ANTIGEN FOR
TREATMENT OF B CELL LYMPHOMA," having U.S. Serial No. 07/978,891,
filed November 13, 1992 (pending), and "THERAPEUTIC APPLICATION OF
CHIMERIC AND RADIOLABELED ANTIBODIES TO HUMAN B
LYMPHOCYTE RESTRICTED DIFFERENTIATION ANTIGEN FOR
TREATMENT OF B CELL LYMPHOMA," having U.S. Serial No. ________, filed
simultaneously herewith. This patent document is related to commonly assigned
United States Serial No. 07/912,292, entitled "RECOMBINANT
ANTIBODIES FOR HUMAN THERAPY," filed July 10, 1992. These documents
are incorporated herein by reference.

As noted, the advent of the biotechnology industry has allowed for the
production of large quantities of proteins. Proteins are the essential constituents
of all living cells and proteins are comprised of combinations of 20 naturally
occurring amino acids; each amino acid molecule is defined ("encoded") by
groupings ("codons") of three deoxyribonucleic acid ("DNA") molecules; a string of
DNA molecules ("DNA macromolecule") provides, in essence, a blueprint for the
production of specific sequences of amino acids specified by that blueprint.
Intimately involved in this process is ribonucleic acid ("RNA"); three types of
RNA (messenger RNA; transfer RNA; and ribosomal RNA) convert the
information encoded by the DNA into, eg, a protein. Thus, genetic information is
generally transferred as follows: DNA → RNA → protein.

In accordance with a typical strategy involving recombinant DNA technology, a
DNA sequence which encodes a desired protein material ("cDNA") is identified
and either isolated from a natural source or synthetically produced. By
manipulating this piece of genetic material, the ends thereof are tailored to be
ligated, or "fit," into a section of a small circular molecule of double stranded
DNA. This circular molecule is typically referred to as a "DNA expression
vector," or simply a "vector." The combination of the vector and the genetic
material can be referred to as a "plasmid" and the plasmid can be replicated in a
prokaryotic host (ie bacterial in nature) as an autonomous circular DNA
molecule as the prokaryotic host replicates. Thereafter, the circular DNA
plasmid can be isolated and introduced into a eukaryotic host (ie mammalian in
nature) and host cells which have incorporated the plasmid DNA are selected.

While some plasmid vectors will replicate as an autonomous circular DNA
molecule in mammalian cells (eg plasmids comprising Epstein Barr virus
("EBV") and Bovine Papilloma virus ("BPV") based vectors), most plasmids
including DNA vectors, and all plasmids including RNA retroviral vectors, are
integrated into the cellular DNA such that when the cellular DNA of the
eukaryotic host cell replicates, the plasmid DNA will also replicate.
Accordingly, as eukaryotic cells grow and divide, there is a corresponding
increase in cells containing the integrated plasmid which leads to the production
("expression") of the protein material of interest. By subjecting the host cells
containing the plasmid to favorable growth conditions, significant amounts of the
host, and hence the protein of interest, are produced. Typically, the Chinese
Hamster Ovary ("CHO") cell line is utilized as a eukaryotic host cell, while E. coli
is utilized as a prokaryotic host cell.

The vector plays a crucial role in the foregoing: manipulation of the vector can
allow for variability as to where the cDNA is inserted, means for determining
whether the cDNA was, in fact, properly inserted within the vector, the
conditions under which expression of the genetic material will or will not occur,
etc. However, most of the vector manipulations are geared toward a single
goal: increasing expression of a desired gene product, ie protein of interest.
Stated again, most vector manipulation is conducted so that an "improved"
vector will allow for production of a gene product at significantly higher levels
when compared to a "non-improved" vector. Thus, while certain of the
features/aspects/characteristics of one vector may appear to be similar to the
features/aspects/characteristics of another vector, it is often necessary to
examine the result of the overall goal of the manipulation - improved production
of a gene product of interest.

While one "improved" vector may comprise characteristics which are desirable
for one set of circumstances, these characteristics may not necessarily be
desirable under other circumstances. However, one characteristic is desirable for
all vectors: increased efficiency, ie the ability to increase the amount of protein
of interest produced while at the same time decreasing the number of host cells
to be screened which do not generate a sufficient amount of this protein. Such
increased efficiency would have several desirable advantages, including reducing
manufacturing costs and decreasing the time spent by technicians in screening
for viable colonies which are expressing the protein of interest. Accordingly,
what would be desirable and what would significantly improve the state of the
art are expression vectors with such efficiency characteristics.



SUMMARY OF THE INVENTION



The invention disclosed herein satisfies these and other needs. Disclosed herein
are fully impaired consensus Kozak sequences which are most typically used
with dominant selectable markers of transcriptional cassettes which are a part of
an expression vector; preferably, the dominant selectable marker comprises
either a natural intronic insertion region or artificial intronic insertion region,
and at least one gene product of interest is encoded by DNA located within such
insertion region.



As used herein, a "dominant selectable marker" is a gene sequence or protein
encoded by that gene sequence; expression of the protein encoded by the
dominant selectable marker assures that a host cell transfected with an
expression vector which includes the dominant selectable marker will survive a
selection process which would otherwise kill a host cell not containing this
protein. As used herein, a "transcriptional cassette" is DNA encoding for a
protein product (eg a dominant selectable marker) and the genetic elements
necessary for production of the protein product in a host cell (ie promoter;
transcription start site; polyadenylation region; etc.). These vectors are most
preferably utilized in the expression of proteins in mammalian expression
systems where integration of the vector into host cellular DNA occurs.
Beneficially, the use of such fully impaired consensus Kozak sequences improves
the efficiency of protein expression by significantly decreasing the number of
viable colonies while, at the same time, significantly increasing the amount of
protein expressed by such viable colonies. As used herein, a "natural intronic
insertion region" is a region of DNA naturally present within a gene, in this case,
typically a dominant selectable marker, which can be utilized for insertion of
DNA encoding a gene product of interest; an "artificial intronic insertion region"
is a region of DNA which is selectively created in a gene (again, most typically, a
dominant selectable marker) which can be utilized for insertion of DNA encoding
a gene product of interest. Information regarding intronic positioning is
described in Abrams, J.M. et al. "Intronic Positioning Maximizes Co-expression
and Co-amplification of Nonselectable Heterologous Genes." J. Bio. Chem.,
264/24:14016 (1989), and U.S. Patent No. 5,043,270 (both documents being
incorporated herein by reference).

As defined, disclosed and claimed herein, a "fully impaired consensus Kozak"
comprises the following sequence:

-3       +1
Pyxx ATG Pyxx

where: "x" is a nucleotide selected from the group consisting of adenine (A),
guanine (G), cytosine (C) or thymine (T)/uracil (U); "Py" is a pyrimidine
nucleotide, ie C or T/U; "ATG" is a codon encoding for the amino acid
methionine, the so-called "start" codon; and -3 and +1 are directional reference
points vis-a-vis ATG, ie -3 is meant to indicate three nucleotides upstream of
ATG and +1 is meant to indicate one nucleotide downstream of ATG.

Preferably, the fully impaired consensus Kozak is part of a methionine start
codon that initiates translation of a dominant selectable marker portion of a
transcriptional cassette which is part of an expression vector. Preferred
dominant selectable markers include, but are not limited to: herpes simplex
virus thymidine kinase; adenosine deaminase; asparagine synthetase;
Salmonella his D gene; xanthine guanine phosphoribosyl transferase;
hygromycin B phosphotransferase; and neomycin phosphotransferase. Most
preferably, the dominant selectable marker is neomycin phosphotransferase.

In particularly preferred embodiments of the invention, at least one out-of-frame
start codon (ie ATG) is located upstream of the fully impaired consensus Kozak
start codon, without an in-frame stop codon being located between the upstream
start codon and the fully impaired consensus Kozak start codon. As used herein,
the term "stop codon" is meant to indicate a codon which does not encode an
amino acid such that translation of the encoded material is terminated; this
definition includes, in particular, the traditional stop codons TAA, TAG and
TGA. As used herein, the terms "in-frame" and "out-of-frame" are relative to the
fully impaired consensus Kozak start codon. By way of example, in the following
sequence:

     ____   -3      +1
GAC CAT GGC Cxx ATG Cxx
            -----------

the underlined portion of the sequence is representative of a fully impaired
consensus Kozak (where "x" represents a nucleotide) and the codons GAC, CAT
and GGC are "in-frame" codons relative to the ATG start codon. The above-lined
nucleotides represent an "out-of-frame" start codon which is upstream of the
fully impaired consensus Kozak start codon. Preferably, the out-of-frame start
codon is within about 1000 nucleotides upstream of the fully impaired consensus
Kozak start codon, more preferably within about 350 nucleotides upstream of the
fully impaired consensus Kozak start codon, and most preferably within about 50
nucleotides upstream of the fully impaired consensus Kozak start codon.
Preferably, the out-of-frame start codon is a part of a consensus Kozak. By way
of example, the sequence set forth above satisfies these criteria: the -5 nucleotide
is a purine (G); nucleotides -6, -7 and -8 encode an out-of-frame start codon (ATG);
and nucleotide -11 is a purine (A).
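
The upstream start codon criteria described above lend themselves to a mechanical
check. The following is a minimal Python sketch offered for illustration only; it
is not part of the disclosed method. The function name
upstream_out_of_frame_starts, the 0-based indexing, and the reading of "between"
as in-frame codons lying wholly between the two ATGs are assumptions, and "x" is
arbitrarily set to T in the example sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # the traditional stop codons named in the text


def upstream_out_of_frame_starts(seq, main_start, window=1000):
    """Return 0-based indices of upstream ATGs that are out of frame with the
    ATG at main_start and have no in-frame stop codon (frame taken relative to
    the main ATG) lying wholly between the two start codons.

    window is the maximum upstream distance searched (hypothetical parameter,
    defaulting to the "about 1000 nucleotides" figure in the text).
    """
    seq = seq.upper()
    if seq[main_start:main_start + 3] != "ATG":
        raise ValueError("main_start must point at an ATG")
    hits = []
    for u in range(max(0, main_start - window), main_start - 2):
        if seq[u:u + 3] != "ATG":
            continue
        if (main_start - u) % 3 == 0:
            continue  # this upstream ATG is in frame with the main start codon
        # In-frame codons starting at or after the end of the upstream ATG.
        between = [seq[j:j + 3] for j in range(main_start - 3, u + 2, -3)]
        if not any(codon in STOP_CODONS for codon in between):
            hits.append(u)
    return hits


# GAC CAT GGC Cxx ATG Cxx with x = T; the main start codon begins at index 12.
print(upstream_out_of_frame_starts("GACCATGGCCTTATGCTT", 12))  # [4]
```

Replacing the codon immediately upstream of the main ATG with TAA (an in-frame
stop) causes the same call to return an empty list under these assumptions.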



Additionally, utilization of a fully impaired consensus Kozak within a secondary
structure (ie a so-called "stem-loop" or "hairpin") is beneficially viable to
impairment of translation of the protein encoded by the dominant selectable
marker. In such an embodiment, the start codon of the fully impaired consensus
Kozak is most preferably located within the stem of a stem loop.

Particularly preferred expression vectors which incorporate these aspects of the
invention disclosed herein are referred to as "TCAE," "ANEX" and
"NEOSPLA" vectors; particularly preferred vectors are referred to as ANEX 1,
ANEX 2 and NEOSPLA3F.

These and other aspects of the invention disclosed herein will be delineated in
further detail in the sections to follow.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 provides the relevant portion of a consensus Kozak and several
particularly preferred fully impaired consensus Kozak sequences;

Figure 2 provides a diagrammatic representation of the vectors TCAE 5.2 and
ANEX 1 (TCAE 12) designed for expression of mouse/human chimeric
immunoglobulin, where the immunoglobulin genes are arranged in a tandem
configuration using neomycin phosphotransferase as the dominant selectable
marker;






WO 94/11523    PCT/US93/11221



Figure 3 is a histogram comparing protein expression levels with the vectors
TCAE 5.2 and ANEX 1;

Figure 4 provides a diagrammatic representation of the vector ANEX 2 designed
for expression of mouse/human chimeric immunoglobulin, where the
immunoglobulin genes are arranged in a tandem configuration using neomycin
phosphotransferase as the dominant selectable marker;

Figure 5 is a histogram comparing protein expression levels with the vectors
TCAE 5.2, ANEX 1 and ANEX 2;

Figure 6 provides a diagrammatic representation of a NEOSPLA vector designed 
for expression of mouse/human chimeric immunoglobulin; and 

Figures 7A, 7B and 7C are histograms comparing protein expression levels with
the vectors TCAE 5.2 vs. NEOSPLA3F (7A); ANEX 2 vs. NEOSPLA3F (7B); and
GK-NEOSPLA3F vs. NEOSPLA3F (7C).

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

Disclosed herein are nucleic acid sequence arrangements which impair
translation and initiation of, most preferably, dominant selectable markers
incorporated into mammalian expression vectors and which are preferably, but
not necessarily, co-linked to an encoding sequence for a gene product of interest.
Preferably, the dominant selectable marker comprises at least one natural or
artificial intronic insertion region, and at least one gene product of interest is
encoded by DNA located within at least one such intron. Such arrangements
have the effect of increasing expression efficiency of the gene product of interest
by, inter alia, decreasing the number of viable colonies obtained from an
equivalent amount of plasmid DNA transfected per cell, while increasing the
amount of gene product expressed in each clone.



For purposes of brevity and presentational efficiency, the focus of this section of
the patent disclosure will be principally directed to a specific dominant selectable
marker, neomycin phosphotransferase, which is incorporated into a
mouse/human chimeric immunoglobulin expression vector. It is to be
understood, however, that the invention disclosed herein is not intended, nor is it
to be construed, as limited to these particular systems. To the contrary, the
disclosed invention is applicable to mammalian expression systems in toto,
where vector DNA is integrated into host cellular DNA.

One of the most preferred methods utilized by those in the art for producing a
mammalian cell line that produces a high level of a protein (ie "production cell
line") involves random integration of DNA coding for the desired gene product (ie
"exogenous DNA") by using, most typically, a drug resistant gene, referred to as
a "dominant selectable marker," that allows for selection of cells that have
integrated the exogenous DNA. Stated again, those cells which properly
incorporate the exogenous DNA including, eg, the drug resistant gene, will
maintain resistance to the corresponding drug. This is most typically followed by
co-amplification of the DNA encoding for the desired gene product in the
transfected cell by amplifying an adjacent gene that also encodes for drug
resistance ("amplification gene"), eg resistance to methotrexate (MTX) in the case
of the dihydrofolate reductase (DHFR) gene. The amplification gene can be the
same as the dominant selectable marker gene, or it can be a separate gene. (As
those in the art appreciate, "transfection" is typically utilized to describe the
process or state of introduction of a plasmid into a mammalian host cell, while
"transformation" is typically utilized to describe the process or state of
introduction of a plasmid into a bacterial host cell.)

Two amplification approaches are typically employed by those in the art. In the
first, the entire population of transfected and drug resistant cells (each cell
comprising at least one integration of the gene encoding for drug resistance) is
amplified; in the second, individual clones derived from a single cell are
amplified. Each approach has unique advantages.

With respect to the first approach, it is somewhat "easier" to amplify the entire
population (typically referred to as a "shotgun" approach, an apt description)
compared to individual clones. This is because amplification of individual clones
initially involves, inter alia, screening of hundreds of isolated mammalian
colonies (each derived from a single cell, most of which being single copy
integrants of the expression plasmid) in an effort to isolate the one or two "grail"
colonies which secrete the desired gene product at a "high" level, ie at a level
which is (typically) three orders of magnitude higher than the lowest detectable
expression level. These cells are also often found to have only a single copy
integration of the expression plasmid. Additionally, amplifying individual clones
results in production cell lines which contain fewer copies of the amplified gene
as compared to amplification of all transfected cells (typically, 10-20 versus
500-1000).

With respect to the second approach, production cell lines derived from
amplifying individual clones are typically derived in lower levels of the drug(s)
used to select for those colonies which comprise the gene for drug(s) resistance
and the exogenous gene product (ie in the case of methotrexate and DHFR, <5 nM
versus 1 µM). Furthermore, individual clones can typically be isolated in a
shorter period of time (3-6 months versus 6-9 months).



Ideally then, the tangible benefits of both approaches should be merged: at a
practical level, this would involve decreasing the number of colonies to be
screened, and increasing the amount of product secreted by these colonies. The
present invention accomplishes this task.

The position where the DNA of the dominant selectable marker of the plasmid
DNA is integrated within the cellular DNA of the host cell determines the level
of expression of the dominant selectable marker protein, as is recognized by
those in the art. It is assumed that the expression of a gene encoding a protein of
interest which is either co-linked to or positioned near the dominant selectable
marker DNA is proportional to the expression of the dominant selectable marker
protein. While not wishing to be bound by any particular theory, the inventor
has postulated that if the gene used to select for the integration of the exogenous
DNA in the mammalian cell (ie the dominant selectable marker) was designed
such that translation of that dominant selectable marker was impaired, then
only those plasmids which could overcome such impairment by over-production
of the gene product of the dominant selectable marker would survive, eg, the
drug-screening process. By associating the exogenous DNA with the dominant
selectable marker, then, a fortiori, over-production of the gene product of the
dominant selectable marker would also result in over-production of the gene
product derived from the associated exogenous DNA. In accordance with this
postulated approach, impairment of translation of the dominant selectable
marker gene would be necessary, and an avenue for such impairment was the
consensus Kozak portion of the gene.

By comparing several hundred vertebrate mRNAs, Marilyn Kozak in "Possible
role of flanking nucleotides in recognition of the AUG initiator codon by
eukaryotic ribosomes," Nuc. Acids Res. 9: 5233-5252 (1981) and "Compilation and
analysis of sequences upstream from the translational start site in eukaryotic
mRNAs," Nuc. Acids Res. 12: 857-872 (1984), proposed the following "consensus"
sequence for initiation of translation in higher eukaryotes:

   -3      +1
cc Acc AUG G

(As those in the art appreciate, uracil, U, replaces the deoxynucleotide thymine,
T, in RNA.) In this sequence, referred to as a "consensus Kozak," the most
highly conserved nucleotides are the purines, A and G, shown in capital letters
above; mutational analysis confirmed that these two positions have the strongest
influence on initiation. See, eg, Kozak, M. "Effects of intercistronic length on the
efficiency of reinitiation of eukaryotic ribosomes." Mol. Cell Bio. 7/10: 3438-3445
(1987). Kozak further determined that alterations in the sequence upstream of
the consensus Kozak can affect translation. For example, in "Influences of
mRNA secondary structure on initiation by eukaryotic ribosomes." PNAS 83:
2850-2854 (1986) Kozak describes the "artificial" introduction of a secondary
hairpin structure region upstream from the consensus Kozak in several plasmids
that encoded preproinsulin; it was experimentally determined that a stable stem
loop structure inhibited translation of the preproinsulin gene, reducing the yield
of proinsulin by 85-95%.

Surprisingly, it was discovered by the inventor that by changing the purines A (-3
vis-a-vis ATG start codon) and G (+1) to pyrimidines, translation impairment
was significant: when the consensus Kozak for the neomycin phosphotransferase
gene was subjected to such alterations (as will be set forth in detail below), the
number of G418 resistant colonies significantly decreased; however, there was a
significant increase in the amount of gene product expressed by the individual
G418 resistant clones. As those in the art will recognize, this has the effect of
increasing the efficiency of the expression system: there are fewer colonies to
screen, and most of the colonies that are viable produce significantly more
product than would ordinarily be obtained. Confirmation of the inventor's
postulated theory was thus experimentally determined.



As noted, for purposes of this patent document, a "consensus Kozak" comprises
the following sequence:

-3      +1
Pxx ATG Pxx

a "partially impaired consensus Kozak" comprises the following sequence:

-3         +1
P/Pyxx ATG P/Pyxx

and a disclosed and claimed "fully impaired consensus Kozak" comprises the
following sequence:

-3       +1
Pyxx ATG Pyxx

where: "x" is a nucleotide selected from the group consisting of adenine (A),
guanine (G), cytosine (C) or thymine (T) (uracil, U, in the case of RNA); "P" is a
purine, ie A or G; "Py" is a pyrimidine, ie C or T/U; ATG is a conventional start
codon which encodes for the amino acid methionine (Met); the numerical
designations are relative to the ATG codon, ie a negative number indicates
"upstream" of ATG and a positive number indicates "downstream" of ATG; and
for the partially impaired consensus Kozak, the following proviso is applicable:
only one of the -3 or +1 nucleotides is a pyrimidine, eg, if -3 is a pyrimidine, then +1
must be a purine or if -3 is a purine, then +1 must be a pyrimidine. Most
preferably, the fully impaired consensus Kozak is associated with the site of
translation initiation of a dominant selectable marker which is preferably (but
not necessarily) co-linked to exogenous DNA which encodes for a gene product
of interest. As used herein, "nucleotide" is meant to encompass natural and
synthetic deoxy- and ribonucleotides as well as modified deoxy- and
ribonucleotides, ie where the 3' OH, 5' OH, sugar and/or heterocyclic base are
modified, as well as modification of the phosphate backbone, eg methyl
phosphates, phosphorothioates and phosphoramidites.
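
The three sequence classes defined above differ only in whether the -3 and +1
nucleotides are purines or pyrimidines, so a sequence can be sorted into a class
by inspecting those two positions. The following Python sketch is illustrative
only; the function name classify_kozak and its 0-based atg_index argument are
assumptions and do not appear in this disclosure.

```python
PURINES = set("AG")
PYRIMIDINES = set("CTU")


def classify_kozak(seq, atg_index):
    """Classify the context of the start codon beginning at atg_index (0-based).

    -3 is the nucleotide three positions upstream of the start codon and +1 is
    the nucleotide immediately downstream of it, following the numbering used
    in the text.
    """
    seq = seq.upper()
    if seq[atg_index:atg_index + 3] not in ("ATG", "AUG"):
        raise ValueError("atg_index must point at an ATG/AUG start codon")
    if atg_index < 3 or atg_index + 3 >= len(seq):
        raise ValueError("sequence must include the -3 and +1 positions")
    minus3, plus1 = seq[atg_index - 3], seq[atg_index + 3]
    for base in (minus3, plus1):
        if base not in PURINES | PYRIMIDINES:
            raise ValueError(f"unrecognized nucleotide: {base}")
    n_pyrimidines = (minus3 in PYRIMIDINES) + (plus1 in PYRIMIDINES)
    return ("consensus", "partially impaired", "fully impaired")[n_pyrimidines]


print(classify_kozak("GAACCATGGCC", 5))  # -3 = A, +1 = G -> consensus
print(classify_kozak("GCAGAATGTTA", 5))  # -3 = A, +1 = T -> partially impaired
print(classify_kozak("GACCTATGCTT", 5))  # -3 = C, +1 = C -> fully impaired
```

Counting pyrimidines at the two positions mirrors the proviso above: zero gives a
consensus Kozak, exactly one a partially impaired consensus Kozak, and two a fully
impaired consensus Kozak.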

Information regarding the gene sequence of the dominant selectable marker is
preferably known; however, in lieu of the entire sequence, information regarding
the nucleic acid sequence (or amino acid sequence) at the site of translation
initiation of the dominant selectable marker must be known. Stated again, in
order to effectuate a change in the consensus or partially impaired consensus
Kozak, one must know the sequence thereof. Changing the consensus or
partially impaired consensus Kozak to a fully impaired consensus Kozak
sequence can be accomplished by a variety of approaches well known to those in
the art including, but not limited to, site specific mutagenesis and mutation by
primer-based amplification (eg PCR); most preferably, such change is
accomplished via mutation by primer-based amplification. This preference is
principally based upon the comparative "ease" in accomplishing the task, coupled
with the efficacy associated therewith. For ease of presentation, a description of
the most preferred means for accomplishing the change to a fully impaired
consensus Kozak will be provided.

In essence, mutation by primer-based amplification relies upon the power of the
amplification process itself - as PCR is routinely utilized, focus will be directed
thereto. However, other primer-based amplification techniques (eg ligase chain
reaction, etc) are applicable. One of the two PCR primers ("mutational primer")
incorporates a sequence which will ensure that the resulting amplified DNA
product will incorporate the fully impaired consensus Kozak within the
transcriptional cassette incorporating the dominant selectable marker of
interest; the other PCR primer is complementary to another region of the
dominant selectable marker; a transcriptional cassette incorporating the
dominant selectable marker; or a vector which comprises the transcriptional
cassette. By way of example, the complement to a dominant selectable marker
which includes a consensus Kozak could have the following sequence (SEQ ID
NO: 1):

3'-tagctaggTccTACCcc-5'

In order to create a fully impaired consensus Kozak, the mutational primer could
have the following sequence (SEQ ID NO: 2) (for convenience, SEQ ID NO: 1 is
aligned with the mutational primer for comparative purposes):

5'-atcgatccTggATGCgg-3'
           *     *
3'-tagctaggTccTACCcc-5'

As is evident, complementarity is lacking in the primer at two positions (see the
"*" symbols). By
utilizing excess mutational primer in the PCR reaction, when the sequence
including the consensus Kozak is amplified, the resulting amplified DNA
products will incorporate the mutations such that as the amplified DNA products
are in turn amplified, the mutations will predominate such that a fully impaired
consensus Kozak will be incorporated into the amplification product.

Two criteria are required for the mutational primer: first, the length thereof
must be sufficient such that hybridization to the target will result. As will be
appreciated, the mutational primer will not be 100% complementary to the
target. Thus, a sufficient number of complementary bases are required in order
to ensure the requisite hybridization. Preferably, the length of the mutational
primer is between about 15 and about 60 nucleotides, more preferably between
about 18 and about 40 nucleotides, although longer and shorter lengths are
viable. (To the extent that the mutational primer is also utilized to incorporate
an out-of-frame start codon or secondary structure, the length of the mutational
primer can correspondingly increase.) Second, the ratio of mutational primer to
target must be sufficiently excessive to "force" the mutation. Preferably, the
ratio of mutational primer to target is between about 250:1 to about 5000:1, more
preferably between about 400:1 to about 2500:1, and most preferably between
about 500:1 to about 1000:1.
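
The two criteria above, together with the mismatch positions between SEQ ID
NO: 2 and SEQ ID NO: 1, can be checked with a short script. This is a sketch
under stated assumptions, not a primer-design procedure from this disclosure: the
function names primer_mismatches and within_preferred_ranges are hypothetical,
the two sequences are assumed to be already aligned and of equal length, and only
the broadest "preferably" ranges quoted in the text are tested.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def primer_mismatches(primer_5to3, target_3to5):
    """Return 0-based positions at which the primer does not base-pair with
    the aligned target strand (target written 3'->5', as in the text)."""
    primer, target = primer_5to3.upper(), target_3to5.upper()
    if len(primer) != len(target):
        raise ValueError("sequences must be aligned and of equal length")
    return [i for i, (p, t) in enumerate(zip(primer, target))
            if COMPLEMENT[t] != p]


def within_preferred_ranges(primer_length, primer_to_target_ratio):
    """Check the broadest preferred ranges quoted in the text:
    about 15-60 nucleotides, and a ratio of about 250:1 to about 5000:1."""
    return (15 <= primer_length <= 60
            and 250 <= primer_to_target_ratio <= 5000)


primer = "atcgatccTggATGCgg"  # SEQ ID NO: 2, 5'->3'
target = "tagctaggTccTACCcc"  # SEQ ID NO: 1, 3'->5'
print(primer_mismatches(primer, target))          # [8, 14]
print(within_preferred_ranges(len(primer), 500))  # True
```

The two mismatches fall at the -3 and +1 positions flanking the ATG, which are the
two purine-to-pyrimidine changes that define the fully impaired consensus Kozak.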

Because the parameters of a PCR reaction are considered to be well within the
level of skill of those in the art, details regarding the particulars of that reaction
are not set forth herein; the skilled artisan is readily credited with recognizing
the manner in which this type of mutation can be accomplished using PCR
techniques - the foregoing is provided as a means of providing elucidation as
opposed to detailed edification.

As noted, it is most preferred that the fully impaired consensus Kozak is
associated with the site of translation initiation of a dominant selectable marker
incorporated into a transcriptional cassette which forms a part of an expression
vector. Preferred dominant selectable markers include, but are not limited to:
herpes simplex virus thymidine kinase; adenosine deaminase; asparagine
synthetase; Salmonella his D gene; xanthine guanine phosphoribosyl transferase
("XGPRT"); hygromycin B phosphotransferase; and neomycin
phosphotransferase ("NEO"). Most preferably, the dominant selectable marker
is NEO.

The dominant selectable marker herpes simplex virus thymidine kinase is
reported as having the following partially impaired consensus Kozak:

   -3      +1
cg Cgt ATG Gct

See, Heller, S. "Insertional Activities of a Promoterless Thymidine Kinase Gene,"
Mol. & Cell Bio. S/S:3218-3302, Figure 4, nucleotide 764. By changing the +1
purine (G) to a pyrimidine (C or T/U), a fully impaired Kozak as defined herein is
generated (the -3 of the herpes simplex virus thymidine kinase is a pyrimidine).
Changing the +1 purine to a pyrimidine also has the effect of changing the encoded
amino acid from alanine (GCT) to proline (CCT) or serine (TCT); it is preferred
that conservative amino acid changes result from the changes to the nucleotides.
Thus, it is preferred that the change to TCT be made because the change from
alanine to serine is a more conservative amino acid change than changing
alanine to proline.
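
Because the +1 nucleotide is the first base of the codon following ATG, each
purine-to-pyrimidine change at +1 can alter the encoded amino acid, and the two
candidate codons can be enumerated before choosing one. The sketch below uses the
standard genetic code; the function name plus1_pyrimidine_options is hypothetical
and the single-letter amino acid output (with "*" marking a stop codon) is a
convention chosen for illustration.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position;
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W"
               "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR"
               "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def plus1_pyrimidine_options(codon):
    """For the codon immediately downstream of ATG, return the two codons
    obtained by replacing its first base (the +1 position) with C or T,
    mapped to the amino acid (or "*" for stop) each would encode."""
    codon = codon.upper()
    if len(codon) != 3 or codon not in CODON_TABLE:
        raise ValueError("expected a three-letter DNA codon")
    return {py + codon[1:]: CODON_TABLE[py + codon[1:]] for py in "CT"}


# GCT (alanine) at +1: C gives CCT (proline, P); T gives TCT (serine, S).
print(plus1_pyrimidine_options("GCT"))  # {'CCT': 'P', 'TCT': 'S'}
```

A result of "*" for either option flags a change that would introduce a stop codon
directly after the start codon and so would not be a usable substitution.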

Histidinol dehydrogenase is another dominant selectable marker. See, Hartman,
S.C. and Mulligan, R.C. "Two dominant acting selectable markers for gene
transfer studies in mammalian cells." PNAS 85:8047-8051 (1988). The his D
gene of Salmonella typhimurium has the following partially impaired consensus
Kozak:

   -3      +1
gc Aga ATG Tta

As -3 is a purine, changing -3 to a pyrimidine (C or T/U) results in a fully
impaired consensus Kozak; as is appreciated, because these nucleotides are
upstream of the start codon, no impact on amino acid translation results from
this change.

Hygromycin B phosphotransferase is another dominant selectable marker; the
reported sequence for the hph gene (see, Gritz, L. and Davies, J. "Plasmid
encoded hygromycin B resistance: the sequence of hygromycin B
phosphotransferase gene and its expression in Escherichia coli and
Saccharomyces cerevisiae" Gene 25:179-188, 1983) indicates that the consensus
Kozak is:

   -3      +1
ga Gat ATG Aaa

Both -3 and +1 are purines; thus, changing -3 and +1 to pyrimidines results in a
fully impaired consensus Kozak (this results in the following encoded amino
acids: +1 to C - glutamine; +1 to T - stop codon. Because this codon is
downstream of the start codon, the change to the stop codon TAA should not be
accomplished).
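
The examples above sort a start site by the bases at -3 and +1. The following is a minimal sketch of that bookkeeping, assuming the rule implied by these examples (purines at both positions: consensus; one pyrimidine: partially impaired; two pyrimidines: fully impaired). The function names `classify_kozak` and `parse_site` are hypothetical and are not part of the disclosure.

```python
PURINES = set("AG")
PYRIMIDINES = set("CTU")

def classify_kozak(minus3, plus1):
    """Classify a start site from the bases at -3 and +1 (assumed rule, see text)."""
    bases = (minus3.upper(), plus1.upper())
    for base in bases:
        if base not in PURINES | PYRIMIDINES:
            raise ValueError(f"unexpected base: {base!r}")
    n_pyrimidines = sum(base in PYRIMIDINES for base in bases)
    return ("consensus", "partially impaired", "fully impaired")[n_pyrimidines]

def parse_site(site):
    """Extract (-3, +1) from text such as 'gc Aga ATG Tta'.

    -3 is the first base of the triplet before ATG; +1 is the first base
    of the triplet after ATG, following the numbering used in this document.
    """
    triplets = site.split()
    i = triplets.index("ATG")
    return triplets[i - 1][0], triplets[i + 1][0]

print(classify_kozak(*parse_site("gc Aga ATG Tta")))  # his D
print(classify_kozak(*parse_site("ga Gat ATG Aaa")))  # hph
print(classify_kozak(*parse_site("ga Cat ATG Caa")))  # hph with both positions changed
```

The sketch only inspects two positions; it does not model any other feature of the sequence surrounding the start codon.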

XGPRT is another dominant selectable marker. The reported partially impaired
consensus Kozak of XGPRT has the following sequence:

   -3      +1
tt Cac ATG Agc

See, Mulligan, R.C. and Berg, P. "Factors governing the expression of a bacterial
gene in mammalian cells." Mol. & Cell Bio. 1/5:449-459 (1981), Figure 6. By
changing the +1 purine to a pyrimidine, a fully impaired consensus Kozak is
created; the effect on the encoded amino acid (AGC-serine) is as follows:
CGC-arginine; TGC-cysteine.

Adenosine deaminase (ADA) can also be utilized as a dominant selectable
marker. The reported consensus Kozak sequence for adenosine deaminase is:

   -3      +1
ga Acc ATG Gcc

See, Yeung, C.Y. et al., "Identification of functional murine adenosine deaminase
cDNA clones by complementation in Escherichia coli," J. Bio. Chem.
260/18:10299-10307 (1985), Figure 3. By changing both -3 and +1 purines to
pyrimidines, fully impaired consensus Kozak sequences result. The encoded amino
acid corresponding to GCC (alanine) is changed to either proline (CCC) or serine
(TCC), with the change to serine being preferred, due to the conservative nature
of this change.

The reported partially impaired consensus Kozak for asparagine synthetase is as
follows:

   -3      +1
gc Acc ATG Tgt

See, Andrulis, I.L. et al., "Isolation of human cDNAs for asparagine synthetase
and expression in Jensen rat sarcoma cells," Mol. Cell Bio. 7/7:2435-2443
(1987). Changing the -3 purine to a pyrimidine results in a fully impaired
consensus Kozak.

The partially impaired consensus Kozak for neomycin phosphotransferase (which
includes an upstream out of frame start codon) is as follows:

                        -3      +1
qqr A rr. q gga tcg ttt Cgc ATG Att

Changing the +1 purine to a pyrimidine has the effect of creating a fully
impaired consensus Kozak (changes to the encoded amino acid isoleucine, ATT,
are as follows: CTT - leucine and TTT - phenylalanine, with the change to leucine
being preferred, due to the conservative nature of this change).

The foregoing is not intended, nor is it to be construed as limiting; rather, in the
context of the disclosed invention, the foregoing is presented in an effort to
provide equivalent examples of changes in the reported consensus Kozak
sequences or partially impaired consensus Kozak sequences of several well-known
dominant selectable markers.
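
The amino acid consequences listed for the +1 changes can be reproduced from the standard genetic code. The sketch below is illustrative only; the helper name `plus1_pyrimidine_changes` is hypothetical, and "*" denotes a stop codon.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position ("*" = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def plus1_pyrimidine_changes(codon):
    """Map each codon obtained by replacing the first base (+1) with C or T
    to the one-letter amino acid it encodes."""
    codon = codon.upper()
    return {py + codon[1:]: CODON_TABLE[py + codon[1:]] for py in "CT"}

# Codons immediately downstream of ATG, as given in the examples above.
for marker, codon in [("TK", "GCT"), ("hph", "AAA"), ("XGPRT", "AGC"),
                      ("ADA", "GCC"), ("NEO", "ATT")]:
    print(marker, codon, CODON_TABLE[codon], "->", plus1_pyrimidine_changes(codon))
```

For hph the T substitution yields a stop codon, which is why only the C substitution is usable there.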

As noted, a most preferred dominant selectable marker is NEO. Particularly
preferred fully impaired consensus sequences for NEO are as follows: 

Txx ATG Ctt (SEQ ID NO: 3) 

Cxx ATG Ctt (SEQ ID NO: 4) 

Txx ATG Ttt (SEQ ID NO: 5) 

Cxx ATG Ttt (SEQ ID NO: 6) 

where x are nucleotides. SEQ ID NO: 3 is most preferred; and xx are preferably
CC.

Other transcriptional cassettes, which may or may not include a fully impaired
consensus Kozak, can be incorporated into a vector which includes
transcriptional cassettes containing the disclosed and claimed fully impaired
consensus Kozak; such "other transcriptional cassettes" typically are utilized to
allow for "enhancement," "amplification" or "regulation" of gene product
expression. For example, co-transfection of the exogenous DNA with the
dihydrofolate reductase (DHFR) gene is exemplary. By increasing the levels of the
antifolate drug methotrexate (MTX), a competitive inhibitor of DHFR, presented
to such cells, an increase in DHFR production can occur via amplification of the
DHFR gene. Beneficially, extensive amounts of flanking exogenous DNA will
also become amplified; therefore, exogenous DNA inserted co-linear with an
expressible DHFR gene will also become overexpressed. Additionally,
transcriptional cassettes which allow for regulation of expression are available.
For example, temperature sensitive COS cells, derived by placing the SV40ts mutant
large T antigen gene under the direction of the Rous sarcoma virus LTR (insensitive
to feedback repression by T antigen), have been described. See, 227 Science 23-28
(1985). These cells support replication from the SV40 ori at 33°C but not at 40°C
and allow regulation of the copy number of transfected SV40 ori-containing
vectors. The foregoing is not intended, nor is it to be construed, as limiting;
rather the foregoing is intended to be exemplary of the types of cassettes which
can be incorporated into expression vectors comprising the disclosed fully
impaired consensus Kozak. The skilled artisan is credited with the ability to
[...] objective of the expression system, which are applicable and which can be
advantageously exploited.

As indicated above, in particularly preferred embodiments of the invention, at
least one out-of-frame start codon (ie ATG) is located upstream of the fully
impaired consensus Kozak start codon, without a stop codon being located
between the out-of-frame start codon and the fully impaired consensus Kozak
start codon. The intent of the out-of-frame start codon is to, in effect, further
impair translation of the dominant selectable marker.

As used herein, the term "stop codon" is meant to indicate a "nonsense codon," ie
a codon which does not encode one of the 20 naturally occurring amino acids such
that translation of the encoded material terminates at the region of the stop
codon. This definition includes, in particular, the traditional stop codons TAA,
TAG and TGA.

As used herein, the term "out-of-frame" is relative to the fully impaired
consensus Kozak start codon. As those in the art appreciate, in any DNA
macromolecule (or RNA macromolecule) for every in-frame sequence, there are
two out-of-frame sequences. Thus, for example, with respect to the following
sequence incorporating a fully impaired consensus Kozak:

                -3      +1
gcA TGc cAT Ggc Cxx ATG Cxx

the in-frame codons are separated by triplets, eg, gcA, TGc, cAT and Ggc; the
out-of-frame codons would include, eg, cAT, ATG, Gcc, ccA, ATG and TGg. Thus, two
start codons (in capital letters and underlined) are out-of-frame relative to the
start codon of the fully impaired consensus Kozak.
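
The frame relationship described above can be checked programmatically. The following is a minimal sketch, not a definitive tool: the function name `upstream_out_of_frame_starts` is hypothetical, and the example sequence substitutes "cc" for the unspecified "xx" nucleotides purely for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def upstream_out_of_frame_starts(seq, start_index):
    """Return indices of upstream ATGs that are out-of-frame relative to the
    ATG at start_index and have no stop codon, read in their own frame,
    before start_index."""
    seq = seq.upper()
    if seq[start_index:start_index + 3] != "ATG":
        raise ValueError("start_index must point at an ATG")
    hits = []
    for i in range(start_index):
        if seq[i:i + 3] != "ATG":
            continue
        if (start_index - i) % 3 == 0:
            continue  # in-frame with the main start codon
        codons = (seq[j:j + 3] for j in range(i, start_index, 3))
        if not any(codon in STOP_CODONS for codon in codons):
            hits.append(i)
    return hits

# Example sequence from the text, with "xx" written as "cc" (an assumption).
EXAMPLE = "gcATGccATGgcCccATGCtt"
MAIN_START = 15  # 0-based index of the fully impaired consensus Kozak ATG
print(upstream_out_of_frame_starts(EXAMPLE, MAIN_START))
```

Both upstream ATGs in the example are reported because each lies in one of the two frames that differ from that of the main start codon.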



When such an out-of-frame start codon is utilized, it is preferred that this be
within about 1000 nucleotides upstream of the fully impaired consensus Kozak
start codon, more preferably within about 350 nucleotides upstream of the fully
impaired consensus Kozak start codon, and most preferably within about 50
nucleotides of the fully impaired consensus Kozak start codon.

As is appreciated, the upstream sequence can be manipulated to achieve
positioning at least one out-of-frame sequence upstream of the fully impaired
consensus Kozak start codon using (most preferably) a mutational primer used in
the type of amplification protocol described above.

Utilization of a fully impaired consensus Kozak start codon located within a
secondary structure (ie a "stem-loop" or "hairpin") is beneficially viable to
impairment of translation of the protein encoded by the dominant selectable
marker. In such an embodiment, it is preferred that this start codon be located
within the stem of a stem loop secondary structure. Thus, by way of schematic
example, in such an embodiment, the start codon of the fully impaired consensus
Kozak is positioned as follows:

       T
     x   x
     x   x
     x   A
     x   T
     x   G
xATGxx   Cxx

(An out-of-frame start codon which is not part of the secondary structure is also
represented.) As is appreciated, within the stem loop, complementarity along
the stem is, by definition, typically required. For exemplary methodologies
regarding, inter alia, introduction of such secondary structures into the
sequence, as well as information regarding secondary structure stability, see,
Kozak, PNAS, 1986, supra.

As noted, it is preferred that a dominant selectable marker with a naturally
occurring intronic insertion region or an artificially created intronic insertion
region be utilized and at least one gene product of interest inserted within this
region. While not wishing to be bound by any particular theory, the inventor
postulates that such an arrangement increases expression efficiency because the
number of viable colonies that survive the selection process vis-a-vis the
dominant selectable marker will decrease; the colonies that do survive the
selection process will, by definition, have expressed the protein necessary for
survival, and in conjunction therewith, the gene product of interest will have a
greater tendency to be expressed. As further postulated, the RNA being
transcribed from the gene product of interest within the intronic insertion region
interferes with completion of transcription (elongation of RNA) of the dominant
selectable marker; therefore, the position that the dominant selectable marker is
integrated within the cellular DNA is likely to be a position where a larger
amount of RNA is initially transcribed.

As is appreciated, prokaryotic proteins do not typically include splices and
introns. However, the majority of dominant selectable markers which are
preferred for expression vector technology are derived from prokaryotic systems.
Thus, when prokaryotic-derived dominant selectable markers are utilized, as is
preferred, it is often necessary to generate an artificial splice within the gene so
as to create a location for insertion of an intron comprising the gene product of
interest. It is noted that while the following rules are provided for selection of a
splice site in prokaryotic genes, they can be readily applied to eukaryotic genes.

A general mechanism for the splicing of messenger RNA precursors in eukaryotic
cells is delineated and summarized in Sharp, Philip A. "Splicing of Messenger
RNA Precursors" Science, 235: 766-771 (1987) (see, in particular, Figure 1)
which is incorporated herein by reference. Based upon Sharp, there are four
minimum criteria in the nucleic acid sequence which are necessary for a splice:
(a) 5' splice donor, (b) 3' splice acceptor, (c) branch point, and (d) polypyrimidine
tract. The consensus sequence for the 5' splice donor is reported to be
[C or A]AG/GT[A or G]AGT and for the 3' splice acceptor, NCAG/G (where a "/" symbol
indicates the splice site); see, Mount, S.M. "A Catalogue of Splice Junction
Sequences" Nuc. Acids. Res. 10/2:459-472 (1982) which is incorporated herein by
reference. The consensus sequence for the branch point, ie the location of the
lariat formation with the 5' splice donor, is reported as PyNPyPAPy; and the
reported preferred branch site for mammalian RNA splicing is TACTAAC
(Zhuang, Y. et al. "UACUAAC is the preferred branch site for mammalian mRNA
splicing" PNAS 86: 2752-2756 (1989), incorporated herein by reference).

Typically, the branch point is located at least approximately 70 to about 80 base
pairs from the 5' splice donor (there is no defined upper limit to this distance).
The polypyrimidine tract typically is from about 15 to about 30 base pairs and is
most typically bounded by the branch point and the 3' splice acceptor.

The foregoing is descriptive of the criteria imposed by nature on naturally
occurring splicing mechanisms. Because there is no exact upper limit on the
number of base pairs between the 5' splice donor and the branch point, it is
preferred that the gene product of interest be inserted within this region in
situations where a natural intron exists within the dominant selectable marker.
However, as noted, such introns do not exist within most of the preferred
dominant selectable markers; as such, artificial introns are
preferably utilized with these markers.

In order to generate an artificial intron, a "splice donor:splice acceptor" site must
be located within the encoding region of the dominant selectable marker. Based
upon Sharp and Mount, it is most preferred that the following sequence function
as the splice donor:splice acceptor site - CAGG (with the artificial splice
occurring at the GG region). A preferred sequence is AAGG.



Focusing in on the most preferred sequence CAGG, the following codons and
amino acids can be located within the encoding region of the dominant selectable
marker for generation of the artificial intronic insertion region:

       A                 B                    C
  CAG/   GNN        NCA   G/GN        NNC          AG/G
  Gln    Ala        Ala   Gly         Ala   Leu    Arg
         Asp        Pro               Arg   Phe
         Gly        Ser               Asn   Pro
         Glu        Thr               Asp   Ser
         Val                          Cys   Thr
                                      Gly   Tyr
                                      His   Val
                                      Ile

(As will be appreciated, the same approach to determining viable amino acid
residues can be utilized for the preferred sequence of AAGG). The most
preferred codon group for derivation of the splice donor:splice acceptor site is
group A. Once these amino acid sequences are located, a viable point for
generation of an artificial intronic insertion region can be defined.
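
Locating candidate sites of this kind in a coding sequence amounts to finding each CAGG occurrence and noting where it falls relative to the reading frame. The following is a minimal sketch under that reading of the codon groups; the function name `find_splice_sites` is hypothetical, and the input is assumed to begin at the first base of a codon.

```python
def find_splice_sites(coding_seq, motif="CAGG"):
    """Return (index, group) for each motif occurrence in an in-frame coding
    sequence. The group depends on the codon position at which the motif starts:
      'A': CAG/GNN  (motif starts at codon position 1)
      'B': NCA G/GN (motif starts at codon position 2)
      'C': NNC AG/G (motif starts at codon position 3)
    """
    seq = coding_seq.upper()
    sites = []
    pos = seq.find(motif)
    while pos != -1:
        sites.append((pos, "ABC"[pos % 3]))
        pos = seq.find(motif, pos + 1)
    return sites

# Residues 50-53 of NEO as given in this document: Leu Gln Asp Glu.
print(find_splice_sites("CTGCAGGACGAG"))
# A hypothetical Ala-Arg pair (GCC AGG), which falls in group C.
print(find_splice_sites("GCCAGG"))
```

Passing `motif="AAGG"` applies the same scan to the alternative preferred sequence.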



Focusing on the preferred NEO dominant selectable marker, amino acid
residues Gln Asp (codon group A) are located at positions 51 and 52 of NEO
and amino acid residues Ala Arg (codon group C) are located at positions 172 and
173 (as is appreciated, multiple artificial intronic insertion regions may be
utilized). Focusing on residues 50 - 53 of NEO, the nucleic acid and amino acid
sequences are as follows:

    50  51  52  53
5'  CTG CAG GAC GAG  3'
    Leu Gln Asp Glu

Accordingly, an artificial intronic insertion region can be generated between
residues 51 and 52 of NEO. This region most preferably comprises a branch
point, a polypyrimidine tract and, preferably, a region for insertion of a gene
product of interest, ie a region amenable to enzymatic digestion.

Two criteria are important for the artificial intronic insertion region: the first two
nucleic acid residues of the 5' splice site (eg abutting CAG) are most preferably
GT and the first two nucleic acid residues of the 3' splice site (eg abutting G) are
most preferably AG.

Using the criteria defined above, an artificial intronic insertion region was
generated between amino acid residues 51 and 52 of NEO:

50 NEO 51 52 NEO 53 

LEU GLN Branch Polypyrimidine Tract ASP GLU 3'

5' CTG CAGfiZAAGT GCGGCCGC TACTAACCTQs CltOsTCCCT^ C CTGG^fiCAC GAG 

Not I 

(Details regarding the methodology for creating this artificial intronic insertion
region are set forth in the Example Section to follow). The Not I site was created
as the region where the gene product of interest can be incorporated. Therefore,
upon incorporation, the gene product of interest is located between amino acid
residues 51 and 52 of NEO, such that during NEO transmission, the gene
product of interest will be "spliced-out".
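
The splice criteria discussed above (GT at the 5' end, AG at the 3' end, a branch point, a polypyrimidine tract, and an insertion site upstream of the branch point) can be expressed as simple checks. This is a sketch only: the function name, the 0.8 pyrimidine-fraction threshold, and the example intron are assumptions made for illustration, and the example is not the sequence used in the disclosed vectors.

```python
def check_artificial_intron(intron, branch="TACTAAC", insertion_site="GCGGCCGC",
                            min_tract=15, min_pyrimidine_fraction=0.8):
    """Return pass/fail checks for a candidate intron written 5'->3' as DNA.

    The tract is taken as everything between the last branch-point match and
    the terminal AG; the thresholds are illustrative assumptions.
    """
    s = intron.upper()
    b = s.rfind(branch)
    tract = s[b + len(branch):-2] if b != -1 else ""
    py_fraction = sum(c in "CT" for c in tract) / len(tract) if tract else 0.0
    site = s.find(insertion_site)
    return {
        "starts_with_GT": s.startswith("GT"),
        "ends_with_AG": s.endswith("AG"),
        "has_branch_point": b != -1,
        "insertion_site_before_branch": site != -1 and b != -1 and site < b,
        "polypyrimidine_tract_ok": (len(tract) >= min_tract
                                    and py_fraction >= min_pyrimidine_fraction),
    }

# Hypothetical intron: donor, Not I site (GCGGCCGC), branch site, tract, acceptor.
EXAMPLE_INTRON = ("GTAAGT" + "GCGGCCGC" + "TACTAAC"
                  + "TCTCTCCTCCCTCCTTTTTCCTGC" + "AG")
print(check_artificial_intron(EXAMPLE_INTRON))
```

The sketch does not test the distance between the 5' splice donor and the branch point, since in the arrangement described that distance is set by the inserted gene product of interest.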



The host cell line is most preferably of mammalian origin; those skilled in the art
are credited with ability to preferentially determine particular host cell lines
which are best suited for the desired gene product to be expressed therein.
Exemplary host cell lines include, but are not limited to, DG44 and DXB11
[...] carcinoma), CV1 (monkey kidney line), COS (a derivative of CV1 with SV40 T
antigen), R1610 (Chinese hamster fibroblast), BALBC/3T3 (mouse fibroblast),
HAK (hamster kidney line), SP2/0 (mouse myeloma), P3x63-Ag3-653 (mouse
myeloma), BFA-1c1BPT (bovine endothelial cells), RAJI (human lymphocyte) and
293 (human kidney). Host cell lines are typically available from commercial
services, the American Tissue Culture Collection or from published literature.

Preferably the host cell line is either DG44 or SP2/0. See, Urlaub, G. et al.
"Effect of gamma rays at the dihydrofolate reductase locus: deletions and
inversions." Som. Cell & Mol. Gen. 12/6:555-566 (1986) and Shulman, M. et al.
"A better cell line for making hybridomas secreting specific antibodies." Nature
276:269 (1978), respectively. Most preferably, the host cell line is DG44.
Transfection of the plasmid into the host cell can be accomplished by any
technique available to those in the art. These include, but are not limited to,
transfection (including electrophoresis and electroporation), cell fusion with
enveloped DNA, microinjection, and infection with intact virus. See, Ridgway,
A.A.G. "Mammalian Expression Vectors." Chapter 24.2, pp. 470-472, Vectors,
Rodriguez and Denhardt, Eds. (Butterworths, Boston, MA 1988). Most
preferably, plasmid introduction into the host is via electroporation.

EXAMPLES

The following examples are not intended, nor are they to be construed, as
limiting the invention; the examples are intended to demonstrate the
applicability of an embodiment of the invention disclosed herein. The disclosed
fully impaired consensus Kozak sequence is intended to be broadly applied as
delineated above. However, for presentational efficiency, exemplary uses of
particularly preferred embodiments of fully impaired consensus Kozak sequences
are utilized in conjunction with tandem chimeric antibody expression vectors
(also referred to herein as antibody expression vectors) as disclosed below.



I. TANDEM CHIMERIC ANTIBODY EXPRESSION ("TCAE") VECTOR

B cell lymphocytes arise from pluripotent stem cells and proceed through
ontogeny to fully matured antibody secreting plasma cells. The human B
lymphocyte-restricted differentiation antigen Bp35, referred to in the art as
"CD20," is a cell surface non-glycosylated phosphoprotein of 35,000 Daltons;
CD20 is expressed during early pre-B cell development just prior to the
expression of cytoplasmic µ heavy chains. CD20 is expressed consistently until
the plasma cell differentiation stage. The CD20 molecule regulates a step in the
activation process which is required for cell cycle initiation and differentiation.
Because CD20 is expressed on neoplastic B cells, CD20 provides a promising
target for therapy of B cell lymphomas and leukemias. The CD20 antigen is
especially suitable as a target for anti-CD20 antibody mediated therapy because
of accessibility and sensitivity of hematopoietic tumors to lysis via immune
effector mechanisms. Anti-CD20 antibody mediated therapy, inter alia, is
disclosed in co-pending Serial No. 07/978,891 and Serial No. , filed
simultaneously herewith. The antibodies utilized are mouse/human chimeric
anti-CD20 antibodies expressed at high levels in mammalian cells ("chimeric
anti-CD20"). This antibody was derived using vectors disclosed herein, to wit: TCAE
5.2; ANEX 1; ANEX 2; GKNEOSPLA3F; and NEOSPLA3F (an additional
vector, TCAE 8, was also utilized to derive chimeric anti-CD20 antibody - TCAE
8 is identical to TCAE 5.2 except that the NEO translational start site is a
partially impaired consensus Kozak. TCAE 8 is described in the co-pending
patent document filed herewith.).

In commonly-assigned United States Serial Number 07/912,292, disclosed, inter
alia, are human/Old World monkey chimeric antibodies; an embodiment of the
invention disclosed therein are human/macaque chimeric anti-CD4 antibodies in
vector TCAE 6 (see, Figure 6 of Serial Number 07/912,292, and corresponding
discussion). TCAE 6 is substantially identical to TCAE 5.2; TCAE 6 contains
human lambda constant region, while TCAE 5.2 contains human kappa constant
region. TCAE 5.2 and ANEX 1 (referred to in that patent document as TCAE 12)
are disclosed as vectors which can be utilized in conjunction with human/Old
World monkey chimeric antibodies. The comparative data set forth in Serial No.
07/912,292 vis-a-vis TCAE 5.2 and ANEX 1 is relative to expression of chimeric
anti-CD20 antibody.

TCAE 5.2 was derived from the vector CLDN, a derivative of the vector
RLDN10b (see, 253 Science 77-91, 1991). RLDN10b is a derivative of the vector
TND (see, 7 DNA 651-661, 1988). Ie, the vector "family line" is as follows: TND
→ RLDN10b → CLDN → TCAE 5.2 → ANEX 1 (the use of the "→" symbol is not
intended, nor is it to be construed, as an indication of the effort necessary to
achieve the changes from one vector to the next; eg, to the contrary, the number
and complexity of the steps necessary to generate TCAE 5.2 from CLDN were
extensive).

TND was designed for high level expressions of human tissue plasminogen
activator. RLDN10b differs from TND in the following ways: the dihydrofolate
reductase ("DHFR") transcriptional cassette (comprising promoter, murine
DHFR cDNA, and polyadenylation region) was placed in between the tissue
plasminogen activator cassette ("t-PA expression cassette") and the neomycin
phosphotransferase cassette ("NEO cassette") so that all three cassettes were in
tandem and in the same transcriptional orientation. The TND vector permitted selection
with G418 for cells carrying the DHFR, NEO and t-PA genes prior to selection
for DHFR gene amplification in response to methotrexate, MTX. The promoter
in front of the DHFR gene was changed to the mouse beta globin major promoter
(see, 3 Mol. Cell Bio. 1246-1264, 1983). Finally, the t-PA cDNA was replaced by a
polylinker such that different genes of interest can be inserted in the polylinker.
All three eukaryotic transcriptional cassettes (t-PA, DHFR, NEO) of the TND
vector can be separated from the bacterial plasmid DNA (pUC19 derivative) by
digestion with the restriction endonuclease Not I.

CLDN differs from RLDN10b in the following ways: The Rous LTR, positioned
in front of the polylinker, was replaced by the human cytomegalovirus immediate
early gene promoter enhancer ("CMV"), (see, 41 Cell 521, 1985), from the Spe I
site at -581 to the Sst I site at -16 (these numbers are from the Cell reference).

As the name indicates, TCAE vectors were designed for high level expressions of
chimeric antibody. TCAE 5.2 differs from CLDN in the following ways:



A. TCAE 5.2 comprises four (4) transcriptional cassettes, as opposed to three
(3), and these are in tandem order, ie a human immunoglobulin light chain
absent a variable region; a human immunoglobulin heavy chain absent a
variable region; DHFR; and NEO. Each transcriptional cassette contains its own
eukaryotic promoter and polyadenylation region (reference is made to Figure 2
which is a diagrammatic representation of the TCAE 5.2 vector). The CMV
promoter/enhancer in front of the immunoglobulin heavy chain is a truncated
version of the promoter/enhancer in front of the light chain, from the Nhe I site
at -350 to the Sst I site at -16 (the numbers are from the Cell reference, supra).
Specifically,

1) A human immunoglobulin light chain constant region was derived
via amplification of cDNA by a PCR reaction. In TCAE 5.2, this was the human
immunoglobulin light chain kappa constant region (Kabat numbering, amino
acids 108-214, allotype Km 3), and the human immunoglobulin heavy chain
gamma 1 constant region (Kabat numbering amino acids 114-478, allotype Gm1a,
Gm1z). The light chain was isolated from normal human blood (IDEC
Pharmaceuticals Corporation, La Jolla, CA); RNA therefrom was used to
synthesize cDNA which was then amplified using PCR techniques (primers were
derived vis-a-vis the consensus Kabat). The heavy chain was isolated (using
PCR techniques) from cDNA prepared from RNA which was in turn derived from
cells transfected with a human IgG1 vector (see, 3 Prot. Eng. 531, 1990; vector
pNγ162). Two amino acids were changed in the isolated human IgG1 to match
the consensus amino acid sequence in Kabat, to wit: amino acid 225 was
changed from valine to alanine (GTT to GCA), and amino acid 287 was changed
from methionine to lysine (ATG to AAG);

2) The human immunoglobulin light and heavy chain cassettes
contain synthetic signal sequences for secretion of the immunoglobulin chains;

3) The human immunoglobulin light and heavy chain cassettes
contain specific DNA restriction sites which allow for insertion of light and heavy
immunoglobulin variable regions which maintain the translational reading frame
and do not alter the amino acids normally found in immunoglobulin chains;

4) The DHFR cassette contained its own eukaryotic promoter (mouse
beta globin major promoter, "BETA") and polyadenylation region (bovine growth
hormone polyadenylation, "BGH"); and

5) The NEO cassette contained its own eukaryotic promoter (BETA)
and polyadenylation region (SV40 early polyadenylation, "SV").

With respect to the TCAE 5.2 and the NEO cassette, the Kozak region was a
consensus Kozak (which included an upstream Cla I site; SEQ ID NO: 7):

             Cla I    -3      +1
TTGGGAGCTTGG ATCGAT CCAcc ATG Gtt

ANEX 1 (previously named TCAE 12 in the referenced case) is identical to TCAE
5.2 except that in the NEO cassette, the Kozak region was fully impaired (SEQ
ID NO: 8):

             Cla I    -3      +1
TTGGGAGCTTGG ATCGAT CcTcc ATG Ctt



As disclosed in the commonly-assigned referenced case, the impact of utilization
of the fully impaired consensus Kozak was striking: relative to TCAE 5.2, there
was a significant (8-fold) reduction in the number of ANEX 1 G418 resistant
colonies (258 from two electroporations versus 98 from six electroporations) from
the same amount of plasmid DNA transfected per cell; and, there was a
significant increase in the amount of co-linked gene product expressed in each of
the ANEX 1 clones. Referencing the histogram of Figure 3 (Figure 16 of the
commonly assigned referenced case), 258 colonies were derived from 2
electroporations of 25 µg of DNA containing a neomycin phosphotransferase gene
with a consensus Kozak at the translation start site. Two-hundred and one (201)
of these colonies did not express any detectable gene product (less than 25 ng/ml
of chimeric immunoglobulin), and only 8 colonies expressed more than 100 ng/ml.
Again, referencing Figure 3, 98 colonies were derived from 6 electroporations for
ANEX 1 of 25 µg of DNA containing a neomycin phosphotransferase gene with
the fully impaired consensus Kozak at the translation start site (6
electroporations were utilized in order to generate statistically comparative
values; this was because on average, each electroporation for ANEX 1 yielded
about 16 colonies, as opposed to about 129 colonies per electroporation for TCAE
5.2). Eight (8) of the ANEX 1 colonies did not express any detectable gene
product (less than 25 ng/ml), while 62 of these colonies were expressing greater
than 100 ng/ml; of these 62 colonies, nearly 23 were expressing over 250 ng/ml
(23%), with 6 expressing greater than 1000 ng/ml (6%).
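
The per-electroporation averages, the fold reduction and the percentages quoted above follow directly from the colony counts. The short calculation below is a sketch that uses only the counts stated in the text; the variable and function names are illustrative.

```python
def per_electroporation(colonies, electroporations):
    """Average number of G418 resistant colonies per electroporation."""
    return colonies / electroporations

def percent(part, total):
    """Share of colonies, as a percentage of the total for that vector."""
    return 100.0 * part / total

# Colony counts as reported in the text.
tcae = {"colonies": 258, "electroporations": 2, "undetectable": 201, "over_100": 8}
anex = {"colonies": 98, "electroporations": 6, "undetectable": 8,
        "over_100": 62, "over_250": 23, "over_1000": 6}

tcae_rate = per_electroporation(tcae["colonies"], tcae["electroporations"])
anex_rate = per_electroporation(anex["colonies"], anex["electroporations"])
fold_reduction = tcae_rate / anex_rate

print(f"TCAE 5.2: {tcae_rate:.0f} colonies per electroporation")
print(f"ANEX 1:   {anex_rate:.1f} colonies per electroporation")
print(f"Fold reduction: {fold_reduction:.1f}")
print(f"ANEX 1 over 250 ng/ml:  {percent(anex['over_250'], anex['colonies']):.0f}%")
print(f"ANEX 1 over 1000 ng/ml: {percent(anex['over_1000'], anex['colonies']):.0f}%")
```

The computed ratio of roughly 7.9 corresponds to the "8-fold" reduction stated above.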

The foregoing evidences, inter alia, the following: 1) because the difference
between TCAE 5.2 and ANEX 1 was limited to the Kozak translation start site of
the NEO gene, and because the gene product of interest (chimeric anti-CD20
antibody) was co-linked to the NEO gene, a conclusion to be drawn is that these
differences in results are attributed solely to the differences in the Kozak
translation start site; 2) it was experimentally confirmed that utilization of a
fully impaired consensus Kozak in conjunction with a dominant selectable
marker resulted in significantly fewer viable colonies; 3) it was experimentally
confirmed that utilization of a fully impaired consensus Kozak in conjunction
with a dominant selectable marker co-linked to a desired gene product
significantly increased the amount of expressed gene product. Thus, the number
of colonies to be screened decreased while the amount of expressed gene product
increased.

II. IMPACT OF OUT-OF-FRAME START SEQUENCE 

Conceptually, further impairment of translation initiation of the dominant
selectable marker of ANEX 1 could be effectuated by utilization of at least one
out-of-frame ATG start codon upstream of the neomycin phosphotransferase
start codon. Taking this approach one step further, utilization of a secondary
structure ("hairpin") which incorporated the neo start codon within the stem
thereof, would be presumed to further inhibit translation initiation. Thus, when
the out-of-frame start codon/fully impaired consensus Kozak was considered, this
region was designed such that the possibility of such secondary structures was
increased.

As indicated previously, the Kozak region for the neo start codon in the ANEX 1
vector is:

TTGGGAGCTTGG ATCGAT CcTcc ATG Ctt

The desired sequence for a vector identical to ANEX 1 but incorporating the
above-identified changes vis-a-vis the neo start codon, referred to as ANEX 2, is
as follows (SEQ ID NO: 9):

CCA GCA TGG AGG A ATCGAT CC Tcc ATG Ctt

(The out-of-frame start codon is underlined.) The fully impaired consensus
Kozak of ANEX 2 is identical to that of ANEX 1. The principal difference is the
inclusion of the upstream out-of-frame start codon. A possible difference is the
formation of a secondary structure involving this sequence, proposed as follows:



C G  )
T A  } Cla I site
A T  )
A T
G C
G C
A T
G C
G C
T A
A T
C G
G C
A T
C T
C G



The sequence in bold, ATG, is the upstream out-of-frame start codon; the "loop"
portion of the secondary structure is the Cla I site; and the sequence between
the "T" and "C" (italics and bold) is the start codon (underlined) of the fully
impaired consensus Kozak.
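
Whether the two arms flanking the Cla I site can pair may be estimated by searching for the longest stretch of the upstream arm whose reverse complement occurs in the downstream arm. The sketch below is a rough check only: it assumes the ANEX 2 region reads CCAGCATGGAGGA, then ATCGAT, then CCTCCATGCTT, it considers Watson-Crick pairs only, and it ignores loop geometry and stability. The function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def longest_stem(left_arm, right_arm):
    """Longest substring of left_arm whose reverse complement occurs in right_arm."""
    left, right = left_arm.upper(), right_arm.upper()
    for size in range(len(left), 0, -1):
        for i in range(len(left) - size + 1):
            candidate = left[i:i + size]
            if reverse_complement(candidate) in right:
                return candidate
    return ""

LEFT_ARM = "CCAGCATGGAGGA"  # upstream of the Cla I site (assumed reading)
RIGHT_ARM = "CCTCCATGCTT"   # downstream of the Cla I site, through Ctt

stem = longest_stem(LEFT_ARM, RIGHT_ARM)
print(stem, "pairs with", reverse_complement(stem))
```

Under these assumptions the pairing partner of the stem contains the neo ATG, consistent with the start codon lying within the stem.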



In order to effectuate this change, a PCR fragment was cloned into anti-CD20 in
ANEX 1 from Xho I (5520) to Cla I (5901); see, Figure 4. Primers were as follows:

3'-Primer 489 (SEQ ID NO: 10):

5'-GGA GGA TCG ATT CCT CCA TGC TGG
CAC AAC TAT GTC AGA AGC AAA TGT
GAG C-3'

The upper-lined portion of Primer 489 is a Cla I site; the under-lined portion is
the fully impaired consensus Kozak translation start site.

5'-Primer 488 (SEQ ID NO: 11):

5'-CTG GGG CTC GAG CTT TGC-3'

The upper-lined portion of Primer 488 is an Xho I site.
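
The restriction sites carried by the two primers, and the sense-strand sequence encoded by the 5' end of the 3' primer, can be confirmed with a few lines of code. This is a sketch that assumes the primers read as GGA GGA TCG ATT CCT CCA TGC TGG CAC AAC TAT GTC AGA AGC AAA TGT GAG C (489) and CTG GGG CTC GAG CTT TGC (488); the helper names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]

# Recognition sequences of the two enzymes used for cloning.
SITES = {"Cla I": "ATCGAT", "Xho I": "CTCGAG"}

PRIMER_489 = ("GGAGGATCGATTCCTCCATGCTGG"
              "CACAACTATGTCAGAAGCAAATGT"
              "GAGC")
PRIMER_488 = "CTGGGGCTCGAGCTTTGC"

def find_sites(primer):
    """Return {enzyme: 0-based position} for each recognition site in the primer."""
    return {name: primer.find(site) for name, site in SITES.items() if site in primer}

print("Primer 489:", find_sites(PRIMER_489))
print("Primer 488:", find_sites(PRIMER_488))
# Sense strand encoded by the first 24 nucleotides of the 3' primer.
print(reverse_complement(PRIMER_489[:24]))
```

The reverse complement of the first 24 nucleotides of Primer 489 reproduces the ANEX 2 region from the upstream out-of-frame ATG through the Cla I site.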

These primers were prepared using an ABI 391 PCR MATE™ DNA synthesizer
(Applied Biosystems, Foster City, CA). Phosphoramidites were obtained from
Cruachem (Glasgow, Scotland): dA(bz) - Prod. No. 20-8120-21; dG(ibu) - Prod.
No. 20-8110-21; dC(bz) - Prod. No. 20-8130-21; T - Prod. No. 20-8100-21.

Conditions for the PCR reaction using these primers were as follows: 2 µl
("microliters") of anti-CD20 in TCAE 5.2 in plasmid grown in E. coli strain
GM48 (obtained from the ATCC) was admixed with 77 µl of deionized water; 2 µl of
Primer 488 (64 pmoles); and 4 µl of Primer 489 (56 pmoles). This was followed by
a denaturation step (94°C, 5 min.) and a renaturation step (54°C, 5 min.).
Thereafter, 4 µl of 5 mM dNTPs (Promega, Madison, WI: dATP, Prod. No. U1201;
dCTP, Prod. No. U1221; dGTP, Prod. No. U1211; dTTP, Prod. No. U1231), 1 µl of
Pfu DNA polymerase (Stratagene, La Jolla, CA, Prod. No. 600135, 2.5 U/ml), and
50 µl of mineral oil overlay was added thereto, followed by 30 cycles, with each
cycle comprising the following: 72°C, 2 min.; 94°C, 1 min.; 54°C, 1 min. Ten
microliters (10 µl) of this admixture was analyzed by agarose gel electrophoresis
(results not shown); a single band was found at about 400 base pairs.
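
The thermal program described above can be written down as data, which makes the total programmed time easy to compute. The sketch below assumes the step temperatures read 72, 94 and 54 degrees C; the step labels ("extension", "annealing") are conventional names added for illustration and do not appear in the text.

```python
# (label, temperature in degrees C, minutes); labels are illustrative.
INITIAL_STEPS = [("denaturation", 94, 5), ("renaturation", 54, 5)]
CYCLE = [("extension", 72, 2), ("denaturation", 94, 1), ("annealing", 54, 1)]
N_CYCLES = 30

def total_minutes(initial, cycle, n_cycles):
    """Total programmed hold time, ignoring ramping between temperatures."""
    initial_time = sum(minutes for _, _, minutes in initial)
    cycle_time = sum(minutes for _, _, minutes in cycle)
    return initial_time + n_cycles * cycle_time

print(total_minutes(INITIAL_STEPS, CYCLE, N_CYCLES), "minutes")
```

Ramp times between temperatures depend on the instrument and are not included.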

The PCR product and the vector were prepared for ligation as follows: Anti-CD20
in ANEX 1 plasmid grown in E. coli bacterial strain GM48 was digested with
Cla I and Xho I as follows: 20λ of anti-CD20 in ANEX 1 was admixed with 10λ of
10X NEB4 buffer (New England Biolabs, Beverly, MA; hereinafter, NEB); 5λ Cla
I (NEB, Prod. No. 197 S, 60 u); and 64λ deionized water. This admixture was
incubated overnight at 37°C, followed by the addition of 5λ Xho I (NEB, Prod.
No. 146 S, 100 u) and incubation at 37°C for 2 hrs. The resulting material is
designated herein as "Cla I/Xho I cut ANEX 1". The approximately 400 base pair
PCR fragment was prepared and digested with Cla I and Xho I as follows: 50λ of
the PCR fragment was admixed with 10λ of 3M NaOAc; 1λ 10% sodium dodecyl
sulfate (SDS); and 50λ phenol/CHCl3/isoamyl. This admixture was vortexed for
30 sec. followed by a 1 min. spin (1700 RPM). The aqueous phase was subjected
to a spin column which resulted in 85λ total admixture. To this admixture was
added 10λ 10X NEB4, 1λ bovine serum albumin (BSA, 100X; NEB), 2λ Cla I
(24 u), and 2λ Xho I (40 u). This admixture was incubated at 37°C for 2 hrs. The
resulting material is designated herein as "Cla I/Xho I cut PCR 488/489". Both
Cla I/Xho I cut ANEX 1 and Cla I/Xho I cut PCR 488/489 were analyzed by
agarose gel electrophoresis and the resulting bands were observed at the same
relative location on the gel (results not shown).

Ligation of Cla I/Xho I cut PCR 488/489 and Cla I/Xho I cut ANEX 1 was
accomplished as follows: 1λ of tRNA (Sigma, St. Louis, MO, Prod. No. R-8508)
was admixed with 1λ 10% SDS; 10λ 3M NaOAc; 45λ of Cla I/Xho I cut PCR
488/489 (about 22.5 ng); a 1:4 dilution (0.25λ) of Cla I/Xho I cut ANEX 1 (about
32 ng) in 0.75λ tris-hydroxymethyl aminomethane ethylenediamine tetracetic
acid (TE); and 42λ TE. To this admixture was added 50λ phenol/CHCl3/isoamyl,
followed by a 30 sec. vortex and a 1 min. spin (1700 RPM). The aqueous phase
was transferred to a new tube, followed by addition of 270λ of 100% EtOH
(-20°C), a 10 min. spin (13,000 RPM), followed by addition of another 270λ of
100% EtOH (-20°C) and another 1 min. spin (13,000 RPM). This admixture was
dried in a SpeedVAC™ and resuspended in 17λ TE, 2λ ligase buffer (Promega, T4
DNA Ligase kit, Prod. No. M180) and 1λ ligase (Promega Ligase kit). This
ligation mix was incubated at 14°C overnight. Twenty microliters (20λ) of the
ligation mix was admixed with 10λ 3M NaOAc, 1λ 10% SDS, 69λ TE, and 90λ
phenol/CHCl3/isoamyl. This admixture was vortexed for 30 sec. followed by a 1
min. spin (1700 RPM). The aqueous phase was transferred to a new tube and
270λ of 100% EtOH (-20°C) was added thereto, followed by a 10 min. spin (1700
RPM). This admixture was dried in a SpeedVAC™ and resuspended in 20λ TE.
Ten microliters of the resuspended admixture was transformed in E. coli XL-1
Blue™ (Stratagene, La Jolla, CA), following manufacturer instructions. Ten (10)
bacterial colonies were inoculated in LB Broth (Gibco BRL, Grand Island, NY,
Prod. No. M27950B) including ampicillin (50 µg/ml; Sigma, Prod. No. A-9393).
Plasmids were isolated from the 10 cultures with a Promega DNA purification
System (Prod. No. PR-A7100), following manufacturer instructions; these
plasmids may have comprised the ANEX 2 vector, depending on the sufficiency of
the foregoing.

ANEX 2 includes a Hinf I site ("GAATC") upstream of the neo start site (-9 to -13
relative to the neo start codon); ANEX 1 does not include this Hinf I site. The
purified plasmids comprising putative ANEX 2, and previously purified ANEX 1
standard, were subjected to Hinf I digestion as follows: 2λ of each isolate was
admixed with 8λ of Hinf I digestion buffer (15λ 10X NEB2 buffer; 15λ Hinf I
(NEB, Prod. No. 155S, 10 u/λ); and 90λ H2O). This admixture was incubated for 3
hrs. at 37°C and each isolate was analyzed via agarose gel electrophoresis
(results not shown); nine (9) of the bands were substantially identical to the
ANEX 1 standard, one (1) showed a slight difference in band pattern. For this
single isolate, the first two bands were at 1691 and 670 kB; for the ANEX 1 Hinf
I digested product, the first three bands were at 1691, 766, and 670 kB. The
missing band at 766 kB for the single isolate was attributed to the presence of
the Hinf I site therein, indicating that the desired change to ANEX 1 was
incorporated into this vector. This vector was designated "Anti-CD20 in ANEX 2
(G1,K)," and is generally referred to by the inventor as ANEX 2.
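
The Hinf I screen described above can be illustrated with a short sketch. It assumes the standard Hinf I recognition pattern GANTC (of which the "GAATC" named in the text is one instance) and uses a toy circular map whose cut positions are hypothetical, chosen only so that the fragments reproduce the three band sizes quoted above; the real plasmids are larger and give additional bands.

```python
import re


def find_sites(seq, pattern="GANTC"):
    """Return 0-based start positions of a recognition pattern; N matches any base."""
    regex = "(?=" + pattern.replace("N", "[ACGT]") + ")"
    return [m.start() for m in re.finditer(regex, seq.upper())]


def circular_fragments(plasmid_len, cuts):
    """Fragment lengths (largest first) after cutting a circle at the given positions."""
    cuts = sorted(set(cuts))
    sizes = []
    for i, c in enumerate(cuts):
        nxt = cuts[(i + 1) % len(cuts)]
        # A single cut linearizes the circle, giving one full-length fragment.
        sizes.append((nxt - c) % plasmid_len or plasmid_len)
    return sorted(sizes, reverse=True)


# "GAATC" is found as an instance of GANTC.
sites = find_sites("TTGAATCGATCC")

# Toy 3127 bp circle (hypothetical positions): three sites give 1691/766/670.
toy_len = 3127
without_new_site = circular_fragments(toy_len, [0, 1691, 2457])
# One extra site inside the 766 fragment removes that band.
with_new_site = circular_fragments(toy_len, [0, 1691, 2000, 2457])

print(sites, without_new_site, with_new_site)
```

In the toy map the added site splits the 766 fragment into two smaller pieces, so the two largest bands become 1691 and 670, which is the pattern the text reports for the single positive isolate.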

Electroporation of anti-CD20 in ANEX 2 was accomplished as follows: two-
hundred and forty microliters (240λ) of the anti-CD20 in ANEX 2 DNA (400 µg)
was admixed with 100λ of 10X NEB2 buffer; 100λ of Stu I (NEB, Prod. No.
187S, 1000 u); and 560λ TE, and incubated at 37°C for 2 hrs. This admixture was
then placed over 8 spin columns (125λ each), followed by addition of 110λ 10X
Not I buffer (NEB); 10λ 100X BSA; and 20λ of Not I (NEB, Prod. No. 189S,
500 u). This admixture was incubated at 37°C for 3 hrs., followed by the addition
of 120λ of 3M NaOAc and 12λ of 10% SDS. The admixture was transferred to 2
vortex tubes and 500λ of phenol/CHCl3/isoamyl was added to each, followed by a
30 sec. vortex and 1 min. spin (1700 RPM). The aqueous phase was removed
from the tubes and segregated into 3 tubes, followed by the addition to each tube
of -20°C 100% EtOH, followed by 10 min. spin (13,000 RPM). Thereafter, -20°C
70% EtOH was added to each tube, followed by 1 min. spin (13,000 RPM). The
tubes were then placed in a SpeedVAC™ for drying, followed by resuspension of
the contents in 100λ TE in a sterile hood. Five microliters (5λ) of the
resuspended DNA was admixed with 995λ of deionized water (1:200 dilution).
An optical density reading was taken (OD=260) and the amount of DNA present
was calculated to be 0.75 µg/λ. In order to utilize 25 µg of DNA for
electroporation, 32λ of the 1:200 dilution of the DNA was utilized (25 µg was
utilized as this was the amount of DNA utilized for TCAE 5.2 and ANEX 1 in the
foregoing Example I). The 1:200 dilution of DNA was formally referred to as
"Stu I, Not I cut anti-CD20 in ANEX 2 (25 µg) in TE" and generally referred to as
"anti-CD20 in ANEX 2."
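
The concentration estimate above follows the usual absorbance arithmetic, sketched below. The conversion factor of 50 µg/ml per absorbance unit for double-stranded DNA is the common rule of thumb, and the A260 reading of 0.075 is a hypothetical value chosen because it reproduces the 0.75 µg/λ stated in the text; the actual reading is not given.

```python
def dsdna_conc_ug_per_ul(a260, dilution_factor, ug_per_ml_per_od=50.0):
    """Stock concentration in µg/µl from an A260 reading of a diluted sample."""
    return a260 * ug_per_ml_per_od * dilution_factor / 1000.0


a260_reading = 0.075          # hypothetical reading of the 1:200 dilution
dilution = (5 + 995) / 5      # 5 µl into 995 µl of water -> 200-fold
conc = dsdna_conc_ug_per_ul(a260_reading, dilution)
volume_for_25_ug = 25 / conc  # µl of stock that carries 25 µg

print(round(conc, 3), round(volume_for_25_ug, 1))
```

Under these assumptions 25 µg corresponds to roughly 33λ of DNA at 0.75 µg/λ, close to the 32λ reported.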

Host cells utilized were DG44 CHO ("CHO") (see, Urlaub, G., Somatic Cell, 1986,
supra). One hundred milliliters of 6.6 x 10^6 cells/ml (84%) were subjected to a 2
min. spin at 1000 RPM. These were washed with 50 ml sucrose buffered
solution, followed by 5 min. spin at 1000 RPM; the material was then
resuspended in 4.5 ml of the sucrose buffered solution. Thereafter, cells were
counted and 0.4 ml of CHO cells (4.0 x 10^6 cells) were admixed with 32λ of the
anti-CD20 in ANEX 2 in BTX sterile, disposable electroporation cuvettes.
Electroporation settings were as follows: 210 volts; 400 microfaraday
capacitance; 13 ohms resistance, using a BTX 600™ electro cell manipulator
(BTX, San Diego, CA). Nine (9) electroporations were conducted; actual voltage
delivered over actual times were as follows: 1-199V, 4.23 msec; 2-188V, 4.57
msec; 3-189V, 4.24 msec; 4-200V, 4.26 msec; 5-200V, 4.26 msec; 6-199V, 4.26
msec; 7-189V, 4.59 msec; 8-189V, 4.57 msec; 9-201V, 4.24 msec. (As noted in
Example I, the difference in number of performed electroporations was
attributed to the need to achieve a statistically significant number of viable
colonies for each of the three conditions, TCAE 5.2, ANEX 1 and ANEX 2; the
amount of DNA used for each electroporation (25 µg) was the same for each, and
the same number of cells were electroporated.)

Thereafter, the electroporation material was admixed with 20 ml of G418
Growth Media (CHO-S-SFM II minus hypoxanthine and thymidine (Gibco,
Grand Island, NY, Form No. 91-0456PK) including 50 µM hypoxanthine and 8
µM thymidine). The admixture was gently agitated, followed by plating 200 µl of
the admixture per well into 96-well plates, one plate for each electroporation
(nine). Beginning on day 2 after electroporation, through day 17, 150 µl of each
well was removed, and 150 µl of fresh G418 Growth Media containing 400 µg/ml
G418 was added thereto. Colonies were analyzed on day 25.
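
Because only 150 µl of each 200 µl well is exchanged, the G418 concentration in a well approaches 400 µg/ml stepwise rather than at once. The sketch below works through that dilution arithmetic. It assumes, as a simplification not stated in the text, that the plating medium initially contains no G418, that mixing is complete, and that no drug is consumed or degraded between feedings.

```python
def g418_after_exchanges(n_exchanges, well_ul=200.0, exchanged_ul=150.0,
                         fresh_ug_per_ml=400.0, start_ug_per_ml=0.0):
    """G418 concentration in the well after each partial media exchange."""
    conc = start_ug_per_ml
    history = []
    retained_ul = well_ul - exchanged_ul
    for _ in range(n_exchanges):
        # Retained medium keeps its concentration; fresh medium is at full strength.
        conc = (conc * retained_ul + fresh_ug_per_ml * exchanged_ul) / well_ul
        history.append(conc)
    return history


levels = g418_after_exchanges(3)
print(levels)
```

Each exchange closes three quarters of the remaining gap to 400 µg/ml, so under these assumptions a well is within 2% of the nominal concentration after the third feeding.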



One hundred and twenty one (121) colonies expressed anti-CD20 antibody (i.e., 13
colonies per electroporation). Of these, 63 (52%) expressed over 250 µg/ml of
protein; of the 63, 20 of the colonies (16.5%) expressed over 1000 µg/ml of protein.
Only 5 of the 121 colonies (4.1%) expressed less than 25 µg/ml of protein. Figure
5 provides a histogram comparing expression of protein per colonies derived from
the vectors TCAE 5.2, ANEX 1 and ANEX 2.
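
The percentages quoted for the ANEX 2 colonies can be checked directly from the counts in the text. The short calculation below uses only those counts; it also makes explicit that the 16.5% figure is taken against all 121 colonies rather than against the 63 high expressers.

```python
total_colonies = 121
electroporations = 9
counts = {"over_250": 63, "over_1000": 20, "under_25": 5}

# Percent of all 121 colonies in each expression class, to one decimal place.
percent_of_total = {k: round(100 * v / total_colonies, 1) for k, v in counts.items()}
colonies_per_electroporation = round(total_colonies / electroporations, 1)
# Share of the 63 high expressers that exceed the higher threshold.
share_of_high_expressers = round(100 * counts["over_1000"] / counts["over_250"], 1)

print(percent_of_total, colonies_per_electroporation, share_of_high_expressers)
```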

The foregoing data indicates that, inter alia, as between ANEX 1 and ANEX 2, 
the use of at least one out-of-frame start codon upstream of a fully impaired 
consensus Kozak associated with the translation initiation of a dominant 
selectable marker decreases the number of viable colonies expressing co-linked
gene product and significantly increases the amount of expressed co-linked gene
product. 

III. IMPACT OF INSERTION OF GENE PRODUCT OF INTEREST
WITHIN AT LEAST ONE ARTIFICIAL INTRONIC INSERTION
REGION OF A DOMINANT SELECTABLE MARKER

Building further upon the ANEX 2 vector, an artificial splice was generated
between amino acid residues 51 and 52 of the NEO coding region of ANEX 2,
followed by insertion therein of the anti-CD20 encoding region. Two such vectors
were generated: the first, comprising a consensus Kozak sequence for the NEO
translation initiation codon and not comprising an out-of-frame start codon, is
referred to as "GKNEOSPLA3F;" the second, comprising a fully impaired
consensus Kozak and an out-of-frame start codon, is referred to as
"NEOSPLA3F."

Both GKNEOSPLA3F and NEOSPLA3F contain the following artificial intron
sequence between amino acid residues 51 and 52 of NEO:

       51                                                           52
5' CTG CAG/ GTAAGT GCGGCCGC TACTAAC (TC)3 CT (C)3 TCC (T)5 C CTGCAG/GAC GAG 3'

The underlined portion represents a sequence amenable to digestion with Not I 
enzyme; the encoding region for anti-CD20, inter alia, was inserted within this 
region. 

Although not wishing to be bound by any particular theory, the inventor
postulates that during expression, the inclusion of a gene product of interest
within an artificial intronic insertion region of a dominant selectable marker (e.g.,
the NEO gene) should significantly decrease the number of viable colonies
producing, in the case of the disclosed GKNEOSPLA3F and NEOSPLA3F vectors,
anti-CD20 antibody. This is predicated upon two points: first, only those vectors
which are able to transcribe and correctly splice-out the antibody encoding region
and correctly translate NEO will be G418 resistant; second, because each
antibody cassette has its own promoter and polyadenylation region, transcription
and translation of the antibody is independent of translation of NEO.

The GKNEOSPLA3F and NEOSPLA3F vectors were constructed in the following
manner:

Anti-CD20 in ANEX 2 was digested with Not I and Xho I in order to isolate the
1503 bp NEO cassette DNA fragment (see Figure 4 between "Not I 7023" and
"Xho I 5520") as follows: 10 µl of anti-CD20 in ANEX 2 was admixed with 6 µl
deionized H2O ("dH2O"); 1 µl Not I enzyme (NEB, Prod. No. 189S); 2 µl of 10X
Not I digestion buffer (NEB; provided with enzyme); and 1 µl Xho I enzyme
(Promega, Madison, WI, Prod. No. R4164). This digestion mixture was
incubated overnight at 37°C. The resulting digested DNA was size fractionated
by 0.8% agarose gel electrophoresis and the desired fragment migrating at 1503
bp was isolated via the GlassMAX™ method (Gibco BRL, Grand Island, NY, Prod.
No. 15590-011) for insertion into pBluescript SK (-) plasmid DNA (Stratagene, La
Jolla, CA).
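
The fragment sizes quoted in this Example follow from the Figure 4 map coordinates cited in the text. As a small worked check, the sketch below subtracts those coordinates; it assumes that each fragment does not span the origin of the circular map and that the coordinates mark equivalent positions within each site.

```python
def fragment_length(coord_a, coord_b):
    """Length in bp between two map coordinates on the same arc of the plasmid."""
    return abs(coord_a - coord_b)


neo_cassette = fragment_length(7023, 5520)     # Not I 7023 to Xho I 5520
kozak_region = fragment_length(5901, 5520)     # Cla I 5901 to Xho I 5520

print(neo_cassette, kozak_region)
```

The first value matches the 1503 bp NEO cassette; the second, 381 bp, is consistent with the band of about 400 base pairs seen for the Cla I to Xho I PCR fragment used to build ANEX 2.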

pBluescript SK (-) was previously prepared for acceptance of the NEO cassette by
double digestion with Not I and Xho I using the same conditions as above for
anti-CD20 in ANEX 2. Digested pBluescript SK (-) was then collected by ethanol
precipitation by the addition of 70 µl dH2O; 2 µl tRNA (Sigma, St. Louis, MO,
Prod. No. R-8508); 10 µl of 3M NaOAc; and 300 µl 100% EtOH (-20°C). This was
followed by a 10 min spin (13,000 RPM), decanting the supernatant, rinsing with
70% EtOH, decanting the liquid, drying in a SpeedVAC™ and resuspending in
20 µl 1X TE.

Ligation of the NEO cassette DNA fragment into prepared pBluescript SK (-)
vector was accomplished as follows: 10 µl of NEO fragment DNA was admixed
with 6 µl dH2O; 1 µl cut pBluescript SK (-) vector DNA; 2 µl 10X ligation buffer
(Promega, supplied with enzyme); and 1 µl T-4 DNA Ligase (Promega, Prod. No.
M1801) followed by incubation at 14°C overnight. Ligated DNA was collected by
ethanol precipitation as described above for the preparation of pBluescript SK (-)
vector DNA.

Ten (10) µl of the resuspended ligated DNA was transformed into E. coli XL-1
Blue™ (Stratagene), following manufacturer instructions. Ten (10) bacterial
colonies were inoculated in LB broth (Gibco BRL, Prod. No. M27950B) including
ampicillin (50 µg/ml; Sigma, Prod. No. A-9393). Plasmids were isolated from the
10 cultures with a Promega DNA purification system (Prod. No. PR-A7100),
following manufacturer instructions; these plasmids may have comprised the
plasmid referred to as BlueNEO+ depending on the sufficiency of the foregoing.
(BlueNEO+ was confirmed due to the sufficiency of the following procedure.)



BlueNEO+ contains a Not I restriction recognition sequence reformed upon
ligation of the NEO cassette fragment DNA into the pBluescript SK (-) vector.
This site was destroyed by the following: 1 µl of BlueNEO+ DNA was admixed
with 16 µl dH2O; 2 µl 10X Not I digestion buffer (NEB); 1 µl Not I enzyme (NEB).
This was followed by incubation at 37°C for 2 hrs. This digested DNA was then
purified by spin column fractionation resulting in 15 µl final volume. This 15 µl
Not I digested DNA was "blunt-ended" by admixing with 4 µl 5X Klenow buffer
(20 mM Tris-HCl, pH 8.0, 100 mM MgCl2) and 1 µl DNA Polymerase I Large
(Klenow) Fragment (Promega, Prod. No. M2201). This admixture was incubated
at room temperature for 30 minutes. Blunt-ended DNA was then purified by
spin column fractionation, giving a final volume of 15 µl.

15 

Ligation of the blunt-ended DNA was performed in an analogous way as to the
ligation of the NEO cassette fragment DNA into the pBluescript SK (-) vector
except that the final DNA was resuspended in 17 µl of 1X TE.

Following ligation, the DNA was subjected to a second restriction digestion with
Not I by mixing the 17 µl of DNA with 2 µl 10X Not I digestion buffer and 1 µl
Not I enzyme (NEB). Digestion was allowed to proceed at 37°C for 60 minutes.
Following digestion, the admixture was purified by spin column fractionation
resulting in 15 µl final volume.
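
Why cutting, filling in and religating removes the Not I site can be seen from a short top-strand simulation. The sketch assumes the standard Not I cut position (GC^GGCCGC, leaving a four-base 5' overhang) and uses a made-up flanking sequence; the function name and arguments are illustrative only.

```python
NOT_I = "GCGGCCGC"


def fill_in_and_religate(seq, site=NOT_I, cut_offset=2, overhang=4):
    """Top strand after cutting at the site, filling in both ends, and blunt ligation."""
    i = seq.index(site)
    # Klenow extends the recessed 3' end of the left fragment across the overhang.
    left = seq[: i + cut_offset + overhang]
    # The right fragment's top strand already begins with the overhang bases.
    right = seq[i + cut_offset:]
    return left + right


original = "AAA" + NOT_I + "TTT"      # hypothetical flanking bases
religated = fill_in_and_religate(original)

print(religated, NOT_I in religated)
```

The four overhang bases are duplicated, giving GCGGCCGGCCGC in place of GCGGCCGC, so the junction no longer contains a Not I site. Plasmids that escaped the fill-in still carry the site, which is why the second Not I digestion described above can be used to linearize them before transformation.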

25 

Ten (10) µl of the purified DNA was transformed into E. coli XL-1 Blue™
(Stratagene), following manufacturer instructions. Ten (10) bacterial colonies
were inoculated in LB broth (Gibco BRL) including ampicillin (50 µg/ml; Sigma).
Plasmids were isolated from the 10 cultures with a Promega DNA purification
system following manufacturer instructions; these plasmids may have comprised
the plasmid referred to as BlueNEO- depending on the sufficiency of the
foregoing. (BlueNEO- was confirmed due to the sufficiency of the following
procedure.)

5 

BlueNEO- contains a unique Pst I restriction site spanning the codons for amino
acid residues 51 and 52. BlueNEO- was digested with Pst I as follows: an
admixture was formed containing 15 µl dH2O; 1 µl BlueNEO- DNA, 2 µl
digestion buffer 3 (NEB) and 2 µl Pst I enzyme (NEB, Prod. No. 140S). This
admixture was incubated at 37°C for 3 hrs. Digested DNA was then purified by
spin column fractionation. The following synthetic oligonucleotide was then
ligated to the Pst I cohesive ends of BlueNEO-:

5' GGTAAGTGCGGCCGCTACTAACTCTCTCCTCCCTCCTTTTTCCTGCA 3' (SEQ ID NO: 12) and its

complementary sequence:

5' GGAAAAAGGAGGGAGGAGAGAGTTAGTAGCGGCCGCACTTACCTGCA 3' (SEQ ID NO: 13).

Insertion of this linker creates a consensus 5' splice donor site (by ligation)
followed by a Not I site, followed by a consensus splice branch point, followed by
a synthetic polypyrimidine tract, followed by a consensus 3' splice acceptor site,
as indicated above.
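
The order of elements in the artificial intron can be checked programmatically. The sketch below assembles the intron from the elements named in the text and confirms that removing it restores the codons for residues 51 and 52. The expansion of the polypyrimidine tract from the shorthand "(TC)3 CT (C)3 TCC (T)5 C" is an assumed reading of a partly illegible listing, and the helper names are illustrative.

```python
DONOR = "GTAAGT"           # consensus 5' splice donor
NOT_I = "GCGGCCGC"         # insertion site for the antibody cassettes
BRANCH = "TACTAAC"         # consensus branch point
# Assumed reading of "(TC)3 CT (C)3 TCC (T)5 C".
TRACT = "TC" * 3 + "CT" + "C" * 3 + "TCC" + "T" * 5 + "C"
ACCEPTOR = "CTGCAG"        # 3' splice acceptor, ends in AG
INTRON = DONOR + NOT_I + BRANCH + TRACT + ACCEPTOR


def check_intron(intron):
    """Report whether the expected splice elements are present and in order."""
    positions = [intron.find(x) for x in (DONOR, NOT_I, BRANCH, TRACT)]
    return {
        "starts_with_GT": intron.startswith("GT"),
        "ends_with_AG": intron.endswith("AG"),
        "elements_in_order": -1 not in positions and positions == sorted(positions),
        "tract_all_pyrimidine": set(TRACT) <= set("CT"),
    }


def splice(exon5, intron, exon3):
    """Remove the intron from the pre-mRNA built as exon5 + intron + exon3."""
    pre = exon5 + intron + exon3
    return pre[: len(exon5)] + pre[len(exon5) + len(intron):]


report = check_intron(INTRON)
mature = splice("CTGCAG", INTRON, "GACGAG")   # codons flanking residues 51/52

print(report, len(INTRON), mature)
```

Under this reading the intron is 47 nucleotides long, the same length as each linker oligonucleotide, and correct splicing rejoins CTG CAG to GAC GAG.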

20 

Ligation was performed as described above for the ligation of the NEO cassette
into pBluescript SK (-) except using 2 µl of Pst I linearized BlueNEO- DNA and
14 µl (175 pmoles) of annealed complementary oligonucleotides.

The foregoing (and following) synthetic oligonucleotides were chemically
synthesized using an Applied Biosystems 391 PCR MATE™ DNA Synthesizer
(Applied Biosystems, Foster City, CA). All reagents for the synthesis were
purchased from Applied Biosystems.

Ligated DNA was collected by ethanol precipitation as described above for the
preparation of pBluescript SK (-) vector DNA.



Ten (10) µl of the resuspended ligated DNA was transformed into E. coli XL-1
Blue™ (Stratagene), following manufacturer instructions. Ten (10) bacterial
colonies were inoculated in LB broth (Gibco BRL) including ampicillin (50 µg/ml;
Sigma). Plasmids were isolated from the 10 cultures with a Promega DNA
purification system, following manufacturer instructions; these plasmids may
have comprised the plasmid referred to as NEOSPLA and/or NEOSPLA*
depending on the sufficiency of the foregoing and the orientation of the insertion
of the oligonucleotides.

Determination of orientation of the splice junction linker was performed by
nucleic acid sequencing using the Sequenase Version 2.0 DNA Sequencing Kit
(United States Biochemical, Cleveland, OH, Prod. No. 70770) following
manufacturer instructions. Upon determination of linker orientation within six
independent plasmid isolates, identification of NEOSPLA was made such that
the inserted splice junction sequences are in the correct forward orientation with
respect to the direction of NEO transcription.

20 

NEOSPLA was digested with Xho I by forming an admixture of 15 µl dH2O; 1 µl
NEOSPLA DNA; 2 µl 10X digestion buffer D (Promega, supplied with enzyme);
and 2 µl Xho I enzyme (Promega, Prod. No. R6161). This admixture was
digested at 37°C for 3 hrs followed by DNA purification by spin column
fractionation. Into this site was ligated a self complementary synthetic
oligonucleotide having the following sequence: 5' TCGATTAATTAA 3' (SEQ ID NO: 14).
Insertion of this sequence effectively changes the Xho I site to a Pac I restriction
site (as underlined in SEQ ID NO: 14).

Ligation was performed as described above for the ligation of the NEO cassette
into pBluescript SK (-) except using 2 µl of Xho I linearized NEOSPLA DNA and
14 µl (175 pmoles) of annealed complementary oligonucleotides.

Ligated DNA was collected by ethanol precipitation as described above for the
preparation of pBluescript SK (-) vector DNA.

Ten (10) µl of the resuspended ligated DNA was transformed into E. coli XL-1
Blue™ (Stratagene), following manufacturer instructions. Ten (10) bacterial
colonies were inoculated in LB broth (Gibco BRL) including ampicillin (50 µg/ml;
Sigma). Plasmids were isolated from the 10 cultures with a Promega DNA
purification system, following manufacturer instructions; these plasmids may
have comprised the plasmid referred to as NEOSPLA3 depending on the
sufficiency of the foregoing. (NEOSPLA3 was confirmed due to the sufficiency of
the following procedure.)

Anti-CD20 in ANEX 2(G1,K) contains the anti-CD20 light chain and heavy chain
immunoglobulin cassettes and a DHFR cassette bounded by a Not I site at the 5'
end and an Xho I site at the 3' end. Anti-CD20 in ANEX 2(G1,K) was digested
with Xho I by forming an admixture of 15 µl dH2O, 1 µl anti-CD20 in ANEX
2(G1,K) DNA, 2 µl 10X digestion buffer D (Promega, supplied with enzyme) and
2 µl Xho I enzyme (Promega, Prod. No. R6161). This admixture was digested at
37°C for 3 hrs followed by DNA purification by spin column fractionation. Into
this site was ligated a self complementary synthetic oligonucleotide of the
following sequence: 5' TCGAAGCGGCCGCT 3' (SEQ ID NO: 15). Insertion of this
sequence effectively changes the Xho I site to a Not I restriction site (as
underlined in SEQ ID NO: 15).

Ligation was performed as described above for the ligation of the NEO cassette
into pBluescript SK (-) except using 2 µl of Xho I linearized anti-CD20 in ANEX 2
DNA and 14 µl (175 pmoles) of annealed complementary oligonucleotides.
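
The two Xho I site conversions described above (to Pac I in NEOSPLA, and to Not I in anti-CD20 in ANEX 2) rely on the same trick: a self-annealing oligonucleotide with a TCGA overhang and a palindromic core. The sketch below simulates one copy of such a linker ligated into an Xho I cut. The linker sequences are assumed readings of SEQ ID NOS: 14 and 15 from a partly illegible listing, the host sequence is made up, and the function name is illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
XHO_I, PAC_I, NOT_I = "CTCGAG", "TTAATTAA", "GCGGCCGC"


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def insert_linker_at_xho(seq, oligo):
    """Top strand after ligating one self-annealed linker into an Xho I cut (C^TCGAG)."""
    overhang, core = oligo[:4], oligo[4:]
    if overhang != "TCGA" or core != revcomp(core):
        raise ValueError("linker must be TCGA plus a palindromic core")
    cut = seq.index(XHO_I) + 1
    return seq[:cut] + oligo + seq[cut:]


host = "AAAA" + XHO_I + "TTTT"                           # hypothetical flanking bases
to_pac = insert_linker_at_xho(host, "TCGATTAATTAA")      # assumed SEQ ID NO: 14
to_not = insert_linker_at_xho(host, "TCGAAGCGGCCGCT")    # assumed SEQ ID NO: 15

print(to_pac, to_not)
```

In both products the flanking bases no longer read CTCGAG, so the Xho I site is lost while the new Pac I or Not I site is gained. The sketch ignores the possibility of tandem linker insertions.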



Ligated DNA was collected by ethanol precipitation as described above for the
preparation of pBluescript SK (-) vector DNA.

Ten (10) µl of the resuspended ligated DNA was transformed into E. coli XL-1
Blue™ (Stratagene), following manufacturer instructions. Ten (10) bacterial
colonies were inoculated in LB broth (Gibco BRL, Prod. No. M27950B) including
ampicillin (50 µg/ml; Sigma, Prod. No. A-9393). Plasmids were isolated from the
10 cultures with a Promega DNA purification system (Prod. No. PR-A7100),
following manufacturer instructions; these plasmids may have comprised the
plasmid referred to as Anti-CD20 in ANEX 2(G1,K)A depending on the
sufficiency of the foregoing. (This was confirmed due to the sufficiency of the
following procedure.)

Anti-CD20 in ANEX 2(G1,K)A was digested with Not I and Xho I by forming an
admixture of 6 µl dH2O; 10 µl Anti-CD20 in ANEX 2(G1,K); 2 µl 10X Not I
digestion buffer (NEB, supplied with Not I enzyme); and 1 µl Not I enzyme
(NEB). This admixture was digested at 37°C for 3 hrs followed by size
fractionation by 0.8% agarose gel electrophoresis and the desired fragment
migrating at 5515 base pairs was isolated via the GlassMAX method for
insertion into NEOSPLA3.

NEOSPLA3 was previously prepared for acceptance of the anti-CD20 cassette by
digestion of 1 µl of DNA with Not I using an admixture comprising 16 µl dH2O;
2 µl 10X Not I digestion buffer (NEB); 1 µl Not I enzyme (NEB); followed by
incubation at 37°C for 2 hrs. This digested DNA was then purified by spin
column fractionation resulting in 15 µl final volume.

Ligation of the anti-CD20 DNA fragment into prepared NEOSPLA3 vector was
accomplished as follows: 10 µl of anti-CD20 fragment DNA was admixed with
6 µl dH2O; 1 µl cut NEOSPLA3 vector DNA; 2 µl 10X ligation buffer (Promega,
supplied with enzyme); and 1 µl T-4 DNA Ligase (Promega); followed by
incubation at 14°C overnight. Ligated DNA was collected by ethanol
precipitation as described above for the preparation of pBluescript SK (-) vector
DNA.

Ten (10) µl of the resuspended ligated DNA was transformed into E. coli XL-1
Blue™ (Stratagene), following manufacturer instructions. Ten (10) bacterial
colonies were inoculated in LB broth (Gibco BRL) including ampicillin (50 µg/ml;
Sigma). Plasmids were isolated from the 10 cultures with a Promega DNA
purification system following manufacturer instructions; these plasmids may
have comprised the plasmids referred to as anti-CD20 in NEOSPLA3F and anti-
CD20 in NEOSPLA3R depending on the sufficiency of the foregoing and relative
orientation of the inserted fragment with respect to NEO transcription.

Determination of orientation of the anti-CD20 cassette insertion was performed
by double digestion with Kpn I and Spe I (NEB, Prod. No. 133S) in NEB buffer 1
plus acetylated BSA as follows: an admixture comprising 4 µl DNA; 2 µl NEB
buffer 1; 1 µl Kpn I; 1 µl Spe I; 2 µl BSA; and 10 µl dH2O was formed. The
admixture was digested at 37°C for 2 hrs, followed by size fractionation on an
0.8% agarose gel electrophoresis. Upon determination of anti-CD20 insert
orientation within six independent plasmid isolates, identification of anti-CD20
in NEOSPLA3F was made such that the inserted sequences are in the forward
orientation with respect to the direction of NEO transcription.



The 5515 bp anti-CD20 fragment contains the SV40 origin, a chimeric mouse
human immunoglobulin light chain transcriptional cassette, a chimeric mouse
human immunoglobulin heavy chain transcriptional cassette, and a murine
dihydrofolate reductase transcriptional cassette (see Figure 4).



Anti-CD20 in NEOSPLA3F was doubly digested with Kpn I and Stu I by creating
the admixture consisting of 14 µl dH2O, 1 µl anti-CD20 in NEOSPLA3F, 2 µl 10X
digestion buffer 1 (NEB, supplied with enzyme), 2 µl 10X acetylated BSA (NEB,
supplied with Kpn I enzyme), 1 µl Kpn I enzyme, 1 µl Stu I enzyme (NEB, Prod.
Nos. 142S and 187S respectively). This admixture was digested at 37°C for 3 hrs
followed by size fractionation by 0.8% agarose gel electrophoresis and the desired
fragment migrating at 9368 base pairs was isolated via the GlassMAX
method.



A PCR fragment of DNA was generated from TCAE 5.2. The two following
synthetic oligonucleotide primers were utilized in the PCR reaction:

5' primer: 5' GCA TGC GGT ACC GGA TCG ATC GAG CTA GTA GCT TTG C 3'
(SEQ ID NO: 16);

3' primer: 5' CTG ACT AGG CCT AGA GCG GCC GCA CTT ACC TGC ACT TCA
TGC AGG GC 3' (SEQ ID NO: 17)

The underlined portion of SEQ ID NO: 16 represents a Kpn I site, and the
underlined portion of SEQ ID NO: 17 represents a Stu I site.

The PCR product was digested with Kpn I and Stu I and then ligated into
prepared anti-CD20 in NEOSPLA3F.



Ligation of the 627 bp fragment into prepared anti-CD20 in NEOSPLA3F was
accomplished as follows:

2 µl anti-CD20 in NEOSPLA3F; 1 µl SDS; 1 µl tRNA (Sigma); 11 µl 3M sodium
acetate (pH 4.5) were admixed. Following phenol/chloroform isoamyl extraction
of the admixture, the DNA was precipitated from the aqueous phase by addition
of 270 µl ethanol (ice-cold) and this was spun at 13,000 rpm for 10 min.
Following a 70% EtOH wash, the DNA was resuspended in 16 µl TE. 10 µl of
PCR fragment DNA was admixed with 6 µl dH2O, 1 µl cut anti-CD20 in
NEOSPLA3F vector DNA, 2 µl 10X ligation buffer (Promega, supplied with
enzyme) and 1 µl T-4 DNA Ligase (Promega) followed by incubation at 14°C
overnight. Ligated DNA was collected by ethanol precipitation as described
above for the preparation of pBluescript SK (-) vector DNA.

Ten (10) µl of the resuspended ligated DNA was transformed into E. coli XL-1
Blue™ (Stratagene), following manufacturer instructions. Ten (10) bacterial
colonies were inoculated in LB broth (Gibco BRL, Prod. No. M27950B) including
ampicillin (50 µg/ml; Sigma, Prod. No. A-9393). Plasmids were isolated from the
10 cultures with a Promega DNA purification system (Prod. No. PR-A7100),
following manufacturer instructions; these plasmids may have comprised the
plasmid referred to as anti-CD20 in GKNEOSPLA3F depending on the
sufficiency of the foregoing. (Confirmation was based upon sequence
determination of the different regions of GKNEOSPLA3F vs. NEOSPLA3F.)



The new plasmid differs from anti-CD20 in NEOSPLA3F in its Kozak sequence
for the NEO gene which is:

                                -3      +1
TGT GTT GGG AGC TTG GAT CGAT cc Acc ATG Gtt
                     Cla I          Start NEO

for Anti-CD20 in GKNEOSPLA3F, and

                                  -3      +1
TGT G CCA GCA TGG AGG aaT CGA Tcc Tcc ATG Ctt
      upstream Start                  Start NEO

for Anti-CD20 in NEOSPLA3F.
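
The difference between the two NEO start contexts comes down to the bases at the positions labelled -3 and +1. The sketch below classifies a start context using that numbering, in which +1 is the base immediately following the ATG. The three labels and the rule used to assign them (purine at -3 and G at +1 for consensus; pyrimidines at both for fully impaired) are an assumed paraphrase of the definitions used in this document, and the function name is illustrative.

```python
PURINES = {"A", "G"}


def kozak_context(seq, atg_index):
    """Return (-3 base, +1 base, label) for the ATG starting at atg_index."""
    seq = seq.upper()
    if atg_index < 3 or seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("atg_index must point at an ATG with 3 bases upstream")
    minus3 = seq[atg_index - 3]
    plus1 = seq[atg_index + 3]      # first base after the ATG
    if minus3 in PURINES and plus1 == "G":
        label = "consensus"
    elif minus3 not in PURINES and plus1 not in PURINES:
        label = "fully impaired"
    else:
        label = "partially impaired"
    return minus3, plus1, label


gk = kozak_context("ccAccATGGtt", 5)       # GKNEOSPLA3F start context
anex = kozak_context("TccTccATGCtt", 6)    # NEOSPLA3F start context

print(gk, anex)
```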

Comparative analysis of expression of anti-CD20 in TCAE 5 vector (comprising
NEO with consensus Kozak); ANEX 2 vector (comprising NEO with fully
impaired Kozak, and upstream out-of-frame start sequence); NEOSPLA3F (anti-
CD20 inserted via artificial intronic insertion region between amino acids 51 and
52 of NEO; NEO has fully impaired Kozak and an upstream out-of-frame start
sequence); and GKNEOSPLA3F (anti-CD20 inserted via artificial intronic
insertion region between amino acids 51 and 52 of NEO; NEO has consensus
Kozak).

Twenty-five (25) µg of each plasmid (digested as follows: anti-CD20 in TCAE5
and ANEX2 - Not I; anti-CD20 in NEOSPLA3F - Pac I; anti-CD20 in
GKNEOSPLA3F - Pac I and Kpn I) was electroporated into 4 x 10^6 CHO cells;
these digestions were utilized to separate the genes expressed in mammalian
cells from the DNA used to grow the plasmid in bacteria. Following digestion,
EtOH precipitation of the DNA, and drying thereof, the DNA was resuspended in
sterile TE at a concentration of 1 µg/µl. Electroporation conditions were as
described in Example II, except that 230 volts was utilized and, following
electroporation, the mixture of cells and DNA was maintained for 10 min. at
room temperature in the sterile, disposable electroporation cuvette.

Following electroporation, cells were plated into 96 well dishes as shown below in
Table I, based upon the expected frequency of G418 resistant colonies (as derived
from preliminary experiments; data not shown):

TABLE I 
COMPARATIVE EXPRESSION 



Plasmid 



No. Transfections No. Cells Plated No. 96 Well Plates 



TCAE5 
ANEX2 

GKNE0SPLA3F 
NE0SPLA3F 



4x10* 
2x10^ 
2xl0« 
2x10' 



5 
5 
5 
5 






TABLE I (continued)

Plasmid        No. G418 Resistant   Frequency of G418 Resistant Colony
               Colonies             per Transfected Cell

TCAE5          16                   1 in 20,000
ANEX2          16                   1 in 100,000
GKNEOSPLA3F    16                   1 in 100,000
NEOSPLA3F      16                   1 in 1,000,000

(Cells were fed with G418 containing media on days 2, 5, 7, 9, 12, 14, 18, 22, 26,
30 and 34; supernatant from colonies was assayed for immunoglobulin
production and the colonies became confluent in the wells on days 18, 22, 26, 30
and 34).

Figures 7A to 7C provide histogram results and evidence the percentage of
colonies at a particular level of expression.
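
The colony frequencies in Table I can be turned into fold reductions relative to TCAE5, which makes the effect of each vector design easier to compare. The calculation below uses only the "1 in N" values from the table; the variable names are illustrative.

```python
# "1 in N" frequency of G418 resistant colonies per transfected cell (Table I).
one_in = {
    "TCAE5": 20_000,
    "ANEX2": 100_000,
    "GKNEOSPLA3F": 100_000,
    "NEOSPLA3F": 1_000_000,
}

# Fold reduction in colony frequency relative to the TCAE5 vector.
fold_reduction = {name: n / one_in["TCAE5"] for name, n in one_in.items()}

print(fold_reduction)
```

On these figures, either modification alone (impaired Kozak with upstream start in ANEX2, or the intronic insertion in GKNEOSPLA3F) lowers colony frequency five-fold, while combining both in NEOSPLA3F lowers it fifty-fold.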



The Examples provided herein are not to be construed as limited to the specific
vectors, fully impaired consensus Kozak sequences, dominant selectable
markers, transcriptional cassettes, and/or expressed proteins. The fully
impaired consensus Kozak, and the utilization thereof, are not to be construed as
limited to ANEX 1 and ANEX 2 vectors. Similarly, the preferred fully impaired
consensus Kozak sequences and vectors in no way constitute an admission,
either actual or implied, that these are the only sequences or vectors to which the
inventor is entitled. The inventor is entitled to the full breadth of protection
under applicable patent laws. Preferred vectors incorporating fully impaired
consensus Kozak sequences have been identified by the inventor as ANEX 1 and
ANEX 2; for purposes of claiming these vectors by designation, plasmids
comprising these vectors and anti-CD20 were deposited with the American Type
Culture Collection (ATCC), 12301 Parklawn Drive, Rockville, Maryland, 20852,
under the provisions of the Budapest Treaty for the International Recognition of
the Deposit of Microorganisms for the Purpose of Patent Procedure. The
plasmids were tested by the ATCC on November 9, 1992, and determined to be
viable on that date. The ATCC has assigned these plasmids the following ATCC
deposit numbers 69120 (anti-CD20 in TCAE 12 (ANEX 1)) and 69118 (anti-CD20
in ANEX 2 (G1,K)); for purposes of this deposit, these plasmids were transformed
into E. coli.

Although the invention has been described in considerable detail with regard to
certain preferred embodiments thereof, other embodiments within the scope of
the teachings of the present invention are possible. Accordingly, neither the
disclosure nor the claims to follow are intended, nor should be construed, to be
limited by the descriptions of the preferred embodiments contained here.
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SEQUENCE LISTING 
(1) GENERAL INFORMATION 

(i) APPLICANT: Reff, Mitchell E.

(ii) TITLE OF INVENTION: Impaired Dominant Selectable Marker
Sequence and Intronic Insertion
Strategies for Enhancement of
Expression of Gene Product and
Expression Vector Systems Comprising
Same

(iii) NUMBER OF SEQUENCES: 17 

(iv) CORRESPONDING ADDRESS:

(A) ADDRESSEE: IDEC Pharmaceuticals Corporation

(B) STREET: 11011 Torreyana Road

(C) CITY: San Diego

(D) STATE: California

(E) COUNTRY: USA

(F) ZIP: 92121

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Diskette, 3.5 inch, 1.44 Mb

(B) COMPUTER: Macintosh

(C) OPERATING SYSTEM: MSDOS

(D) SOFTWARE: Microsoft Word 5.0

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER:

(B) FILING DATE:

(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Burgoon, Richard P. Jr.

(B) REGISTRATION NUMBER: 34,787

(C) REFERENCE/DOCKET NUMBER:

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: (619) 550-8500
(B) TELEFAX: (619) 550-8750

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17 bases

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: yes

(iv) ANTI-SENSE: yes

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

TAG CTA GGT CCT ACC CC 17

(3) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17 bases

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: yes

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

ATC GAT CCT GGA TGC GG 17

(4) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9 bases

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: yes

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

TNN ATG CTT 9

(5) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9 bases

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single, including nick

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: yes

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

CNN ATG CTT 9

(6) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9 bases

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: yes

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

TNN ATG TTT 9
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(7) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9 bases

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: yes

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

CNN ATG TTT 9

(8) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 29 bases

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: yes

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

TTG GGA GCT TGG ATC GAT 18
CCA CCA TGG TT 11

(9) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 29 bases

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

TTG GGA GCT TGG ATC GAT 18
CCT CCA TGC TT 11

(10) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 bases

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

CCA GCA TGG AGG AAT CGA TCC 21
TCC ATG CTT 9

(11) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 52 bases

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: yes

(iv) ANTI-SENSE: yes

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

GGA GGA TCG ATT CCT CCA TGC TGG 24
CAC AAC TAT GTC AGA AGC AAA TGT GAG C 28
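
SEQ ID NO: 10 is listed as anti-sense, and its first 24 bases, read as GGA GGA TCG ATT CCT CCA TGC TGG, are the reverse complement of the first 24 bases of SEQ ID NO: 9. The sketch below shows one way to check that relationship; the function and constant names are illustrative only and do not come from the disclosure.

```python
# Watson-Crick pairing for the four DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


# SEQ ID NO: 9 as listed (30 bases), with spaces removed.
SEQ_ID_9 = "CCAGCATGGAGGAATCGATCCTCCATGCTT"
# First 24 bases of SEQ ID NO: 10 as read above.
SEQ_ID_10_FIRST_24 = "GGAGGATCGATTCCTCCATGCTGG"

if __name__ == "__main__":
    print(len(SEQ_ID_9))
    print(reverse_complement(SEQ_ID_9[:24]) == SEQ_ID_10_FIRST_24)
```

The same helper can be used to compare any sense and anti-sense pair in the listing, provided both are written 5' to 3'.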
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(12) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 bases

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

CTG GGG CTC GAG CTT TGC 18

(13) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 bases

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

GOT AAG TGC GGC CGC TAC TAA CTC TCT CCT 30
CCC TCC TTT TTC CTG GA 17



(14) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 bases

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: yes

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

GGA AAA AGG AGG GAG GAG AGA GTT AGT AGC GGC 33
CGC ACT TAG CTG CA 14

(15) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 12 bases

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

TCG ATT AAT TAA 12

(16) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 14 bases

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

TCG AAG CGG CCG CT 14

(17) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 37 bases

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

GCA TGC GGT ACC GGA TCC ATC GAG CTA 27
CTA GCT TTG C 10

(18) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 bases

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: no

(iv) ANTI-SENSE: no

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

CTG ACT AGO CCT AGA GCG GCC GCA CTT ACC 30
TGC AGT TCA TCC AGG GC 17
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CLAIMS 

What is claimed is:

1. An expression vector for expressing a protein of interest by
recombinant deoxyribonucleic acid techniques, said vector comprising at least
one dominant selectable marker, wherein the translation initiation start site of
said marker comprises the following sequence:

-3   +1
Pyxx ATG Pyxx

where "Py" is a pyrimidine nucleotide; "x" is a nucleotide; and the numerical
designations are relative to the codon "ATG".

2. The expression vector of claim 1 wherein the nucleic acid sequence
encoding for the protein of interest is co-linked to said dominant selectable
marker.

3. The expression vector of claim 1 wherein said dominant selectable
marker is selected from the group consisting of: herpes simplex virus thymidine
kinase, adenosine deaminase, asparagine synthetase, Salmonella his D gene,
xanthine guanine phosphoribosyl transferase, hygromycin B phosphotransferase,
and neomycin phosphotransferase.

4. The expression vector of claim 1 wherein said translation initiation
start site sequence is selected from the group consisting of TxxATGCxx;
CxxATGCxx; CxxATGTxx; and TxxATGTxx, where "x" is a nucleotide, with the
proviso that the codon "Txx" downstream of the ATG codon does not encode a
stop codon.

5. The expression vector of claim 1 wherein said translation initiation
start site sequence is TxxATGCxx, where "x" is a nucleotide.
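
The pattern recited in claim 1 and the proviso of claim 4 can be expressed as a short check on a 9-nucleotide window (three nucleotides upstream of ATG, the ATG itself, and the following codon). This is a minimal sketch, not a definitive implementation; the helper name and the regular expression are assumptions made for illustration.

```python
import re

# The three stop codons of the standard genetic code, in the DNA alphabet.
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Claim 1 pattern: pyrimidine (C or T) at -3, two arbitrary nucleotides,
# the ATG start codon, a pyrimidine at +4, then two arbitrary nucleotides.
IMPAIRED_SITE = re.compile(r"[CT][ACGT]{2}ATG[CT][ACGT]{2}")


def is_impaired_start_site(site: str) -> bool:
    """Hypothetical helper: True if a 9-nt window fits Pyxx ATG Pyxx and the
    codon following ATG is not a stop codon (the proviso of claim 4)."""
    site = site.upper()
    if IMPAIRED_SITE.fullmatch(site) is None:
        return False
    # No stop codon begins with C, so this test only ever rejects "Txx" codons.
    return site[6:9] not in STOP_CODONS


if __name__ == "__main__":
    for window in ("TCCATGCTT", "ACCATGGTT", "TCCATGTAA"):
        print(window, is_impaired_start_site(window))
```

Here TCCATGCTT is the start site recited in claim 6, ACCATGGTT is the corresponding window of SEQ ID NO: 7 (a purine at -3 and at +4), and TCCATGTAA is a made-up window that fits the pattern but violates the stop-codon proviso.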



6. The expression vector of claim 1 wherein said translation initiation
start site sequence is TCCATGCTT.

7. The expression vector of claim 1 wherein said translation initiation
start site sequence is located within a secondary structure.

8. The expression vector of claim 1 wherein said translation initiation
start site sequence further comprises at least one out-of-frame start codon within
about 1000 nucleotides of the ATG start codon of said start site, with the proviso
that no in-frame stop codon is located within said 1000 nucleotides.

9. The expression vector of claim 1 wherein said translation initiation
start site sequence further comprises at least one out-of-frame start codon within
about 350 nucleotides of the ATG start codon of said start site, with the proviso
that no in-frame stop codon is located within said 350 nucleotides.

10. The expression vector of claim 1 wherein said translation initiation
start site sequence further comprises at least one out-of-frame start codon within
about 50 nucleotides of the ATG start codon of said start site, with the proviso
that no in-frame stop codon is located within said 50 nucleotides.

11. The expression vector of claims 8, 9 and 10 wherein said out-of-frame
start codon is part of a consensus Kozak sequence.
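
The out-of-frame start codon condition of claims 8 to 10 can be illustrated with a small search over the sequence upstream of the marker ATG. The sketch below rests on two stated assumptions: "within N nucleotides" is taken to mean upstream of the marker ATG, and "in-frame stop codon" is taken to mean a stop codon in the reading frame of the upstream ATG, located between that ATG and the marker ATG. The function name and parameters are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def upstream_out_of_frame_starts(seq: str, marker_atg: int, window: int) -> list:
    """Return 0-based indices of ATG codons that lie within `window` nt upstream
    of the marker ATG (at index `marker_atg`), are out of frame with it, and are
    not followed by a stop codon in their own frame before the marker ATG.
    Both interpretations are assumptions, as stated in the text above."""
    seq = seq.upper()
    hits = []
    for i in range(max(0, marker_atg - window), marker_atg):
        if seq[i:i + 3] != "ATG":
            continue
        if (marker_atg - i) % 3 == 0:
            continue  # in frame with the marker ATG, so not counted
        codons = (seq[j:j + 3] for j in range(i, marker_atg, 3))
        if any(codon in STOP_CODONS for codon in codons):
            continue  # stop codon in the upstream ATG's reading frame
        hits.append(i)
    return hits


# SEQ ID NO: 9 as listed (ANEX 2 region); the marker ATG starts at index 24.
SEQ_ID_9 = "CCAGCATGGAGGAATCGATCCTCCATGCTT"

if __name__ == "__main__":
    print(upstream_out_of_frame_starts(SEQ_ID_9, marker_atg=24, window=50))
```

For SEQ ID NO: 9 the search reports the ATG at index 5, which sits 19 nucleotides upstream of the marker ATG and is therefore out of frame with it.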
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12. The expression vector of claim 10 wherein said out-of-frame start 
codon and said translation initiation start site sequence are both included as 
part of a secondary structure. 

13. The expression vector of claims 8, 9, 10 wherein said translation 
initiation start site sequence is part of a secondary structure and said out-of- 
frame start codon is not part of said secondary structure. 

14. A dominant selectable marker encoded by a nucleic acid sequence,
wherein the translation initiation start site of said dominant selectable marker is
selected from the group consisting of TxxATGCxx; CxxATGCxx; CxxATGTxx;
and TxxATGTxx, where "x" is a nucleotide, with the proviso that "Txx"
downstream of the ATG codon does not encode a stop codon.

15. The material of claim 14 wherein said dominant selectable marker
is selected from the group consisting of herpes simplex virus thymidine kinase,
adenosine deaminase, asparagine synthetase, Salmonella his D gene, xanthine
guanine phosphoribosyl transferase, hygromycin B phosphotransferase, and
neomycin phosphotransferase.

16. The material of claim 14 wherein said translation initiation start
site sequence is TxxATGCxx, where "x" is a nucleotide.

17. The material of claim 14 wherein said translation initiation start 
site sequence is TCCATGCTT. 
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18. The material of claim 14 wherein said translation initiation start 
site sequence is located within a secondary structure.



19. The material of claim 14 wherein said translation initiation start
site sequence further comprises at least one out-of-frame start codon within
about 1000 nucleotides of the ATG start codon of said start site, with the proviso
that no in-frame stop codon is located within said 1000 nucleotides.

20. The material of claim 14 wherein said translation initiation start
site sequence further comprises at least one out-of-frame start codon within
about 350 nucleotides of the ATG start codon of said start site, with the proviso
that no in-frame stop codon is located within said 350 nucleotides.

21. The material of claim 14 wherein said translation initiation start
site sequence further comprises at least one out-of-frame start codon within
about 50 nucleotides of the ATG start codon of said start site, with the proviso
that no in-frame stop codon is located within said 50 nucleotides.

22. The material of claims 19, 20 and 21 wherein said out-of-frame start
codon is part of a consensus Kozak sequence.

23. The material of claim 21 wherein said out-of-frame start codon and
said translation initiation start site sequence are both included as part of a
secondary structure.

24. The material of claims 19, 20, 21 wherein said translation initiation 
start site sequence is part of a secondary structure and said out-of-frame start 
codon is not part of said secondary structure. 
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25. An expression vector selected from the group consisting of ANEX 1 
(included within American Type Culture Collection deposit number 69120) and 
ANEX 2 (included within ATCC deposit number 69118). 

26. A plasmid comprising the expression vector of claim 1 wherein the
nucleic acid sequence encoding for said protein of interest is co-linked to said
dominant selectable marker.

27. The plasmid of claim 26 integrated within the cellular
deoxyribonucleic acid of a mammalian host cell.

28. The plasmid of claim 27 wherein said host cell is selected from the
group consisting of DG44, DXB11, CV1, COS, R1610, SP2/0, P3x63-Ag8.653,
BFA-1c1BPT, RAJI, and 293.

29. The expression vector of claim 1 further comprising an artificial
intronic insertion region within said dominant selectable marker, wherein an
encoding sequence for a protein of interest is located within said insertion region.

30. The dominant selectable marker of claim 14 further comprising an
artificial intronic insertion region.
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Neomycin phosphotransferase gene 

                           Cla I      -3
TCAE 5.2   TTGGGAGCTTGG    ATCGAT  CC Acc ATG Gtt
                                          Met Val

ANEX 1     TTGGGAGCTTGG    ATCGAT  CC Tcc ATG Ctt
                                          Met Leu

ANEX 2     CCaGCATGgAGGA   ATCGAT  CC Tcc ATG Ctt
                                          Met Leu

FIG. 1 
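
The residue labels in FIG. 1 follow from the standard genetic code applied to the ATG codon and the codon after it. The sketch below covers only the codons that appear in the figure; the table and the function name are illustrative assumptions, not a full translation routine.

```python
# Partial codon table (standard genetic code), limited to codons shown in FIG. 1.
CODON_TO_RESIDUE = {"ATG": "Met", "GTT": "Val", "CTT": "Leu"}


def first_two_residues(window: str) -> list:
    """Given a 9-nt window (3 nt upstream, ATG, next codon), return the residues
    encoded by the ATG codon and by the codon that follows it."""
    window = window.upper()
    return [CODON_TO_RESIDUE[window[3:6]], CODON_TO_RESIDUE[window[6:9]]]


if __name__ == "__main__":
    print(first_two_residues("ACCATGGTT"))  # window from SEQ ID NO: 7 (TCAE 5.2)
    print(first_two_residues("TCCATGCTT"))  # start site recited in claim 6
```

Changing the nucleotide at +4 from G to C therefore changes the second residue of the marker from valine to leucine, as the figure indicates.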
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TCAE 5.2 vs ANEX 1(TCAE12) 
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TCAE 5.2 vs ANEX 1 vs ANEX 2 
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EX 2 vs NEOSPLA 



Legend: ANEX 2, NEOSPLA3F. Horizontal axis: mg/L.

FIG. 7B

TCAE 5 vs NEOSPLA

Legend: TCAE 5, NEOSPLA3F. Horizontal axis: mg/L.

FIG. 7A

GKNEOSPLA vs NEOSPLA

Legend: GKNEOSPLA3F, NEOSPLA3F. Horizontal axis: mg/L.